KARL PEARSON, 1857-1957


Being a Centenary Lecture by J. B. S. Haldane, delivered at
University College London on 13 May 1957* 


We are met here to-day to celebrate the centenary of the birth of Karl Pearson. To me, at
least, this means that I am glad that Karl Pearson was born, that I think the world is 
better because he was born. 

A greater man than any of us said 

The evil that men do lives after them 
The good is oft interréd with their bones. 

Let us begin, therefore, with some criticisms. And then let us study not only those of 
Pearson’s contributions to science and culture which are widely known, but perhaps some 
also which should be disinterred and brought once more into the light of day. 

As Pitt first stated, and Acton restated more precisely, all power corrupts. It is impos- 
sible to be a professor in charge of an important department, and the editor of an important 
journal, without being somewhat corrupted. We can now see that in both capacities 
Pearson made mistakes. He rejected lines of research which later turned out to be fruitful. 
He used his own energy and that of his subordinates in research which turned out to be 
much less important than he believed. It is, however, very easy to say what any one 
ought to have done fifty years ago! 

But this criticism can be, and has been, pushed much further. It is said that Pearson 
espoused a fundamentally false theory of heredity, and therefore of evolution, and that as 
a consequence his work was not merely useless, but actually retarded progress. Had 
Pearson become dictator of British research on heredity and evolution, this might have 
been true. Fortunately he did not. I believe that his theory of heredity was incorrect in 
some fundamental respects. So was Columbus’ theory of geography. He set out for China, 
and discovered America. But he is not regarded as a failure for this reason. When I turn 
to Pearson’s great series of papers on the mathematical theory of evolution, published in 
the last years of the nineteenth century, I find that the theories of evolution now most 
generally accepted are very far from his own. But I find that in the search for a self- 
consistent theory of evolution he devised methods which are not only indispensable in any 
discussion of evolution. They are essential in every serious application of statistics to any 
problem whatever. If, for example, I wish to describe the distribution of British incomes, 
the response of different individuals to a drug, or the results of testing materials used in 
engineering, I must start off from the foundation laid in his memoir on ‘Skew variation in 
homogeneous material’. After sixty-three years I shall certainly take some short cuts 
through the jungle of his formulae, some of which he himself made in later years. Very few 
ships to-day follow Columbus’ course across the Atlantic. 

Let me put the matter in another way. Anyone reading the controversy between Pearson 
and Weldon on one side, and Bateson and his colleagues on the other, which reached its 

* The author is quite aware that he has repeated himself in a way which would be unjustified had 
the material been put together for an article, but which is justifiable in an oration. On the other hand,
he considers that it is seldom desirable to hack what was designed to be spoken into a form suitable
for reading. 
culmination about fifty to fifty-five years ago, might have said ‘I do not know who is right,
but it is certain that at least one side is wrong’. In fact both were right in essentials. The 
general theory of Mendelism is, I believe, correct in a broad way. But we can now see that 
if Mendelism were completely correct, natural selection, as Pearson understood it, could 
not occur. For the frequency of one gene could never increase at the expense of another, 
except by chance, or as we now put it, sampling errors. It is just the divergence between 
observed results and theoretical expectations, to which Pearson rightly drew attention, 
which gives Mendelian genetics their evolutionary importance. 

After this preamble I pass to my main task. Pearson’s connexion with this College began 
when he was nine years old, and was sent to University College School, where he remained 
for seven years. He left at sixteen and obtained a scholarship at King’s College, Cambridge, 
at eighteen. As an undergraduate he studied mathematics and was third wrangler in 1879. 
He had already shown something of his future mettle by a successful refusal to attend 
divinity lectures. In spite, or perhaps because, of this independence of spirit he became a 
fellow of King’s in 1880. He spent about a year in the universities of Heidelberg and Berlin, 
attending lectures on philosophy and Roman law as well as physics and biology. However, 
the most striking effect of his German year was to interest him in mediaeval and Renais- 
sance German literature, especially the development of ideas on religion and the position 
of women. At about this time he began to spell his Christian name with a K instead of a C.
This may have been a homage to German culture. It may have been a special homage to 
Karl Marx, for we know that he later lectured on Marx, and his daughter tells me that when 
in Germany the police once searched his rooms, and he considered that one of Marx’s books 
was the most subversive of the documents which they found there. 

In 1880 he began the study of law in London, and was called to the bar in 1881. This 
may have been a tribute to his father, who was a Q.C., or a means of ensuring a livelihood 
in future, more probably both. He also published his first books, The New Werther* and 
The Trinity, a Nineteenth Century Passion Play. Both were anonymous, and had they 
been signed, would certainly have prejudiced their author’s chance of appointment in many 
institutions, perhaps even in the Infidel College, which suffers from occasional outbreaks 
of respectability. For both attack Christian orthodoxy. 

It was at this period of his life that he lectured on Marx to small audiences in London, 
on the ‘Ethic of Freethought’ at South Place, and to the Sunday Lecture Society on
‘Matter and Soul’.

In 1884, at the age of 27, he was appointed to the Chair of Applied Mathematics and 
Mechanics in this College. He had only published two small papers on rather academic 
problems of applied mathematics. His first publication after his appointment was The 
Common Sense of the Exact Sciences, by his illustrious predecessor in this College, 
W. K. Clifford. His next was even more surprising. It was written in German, entitled 
Die Fronica: Ein Beitrag zur Geschichte des Christusbildes im Mittelalter, and published at
Strassburg. So far as I know it was the first contribution made by a professor of this 
College to the history of art. It is interesting to see that he regarded this as a worthy 
topic of academic study. May I hope that, now that we have a Chair of this subject, our 
Professor may comment on Pearson’s contribution to it. 

He was clearly a very successful and thorough teacher of applied mathematics, mainly 


* For the bibliography of Pearson’s works, and for much else, I rely on E. S. Pearson’s invaluable 
memoir (Cambridge, 1938). In this lecture I have not even mentioned some of his books. 
to students of engineering. He edited de Saint-Venant’s work on the theory of elasticity,
and wrote the second part of Todhunter’s History of the Theory of Elasticity. His radical 
activities continued. In 1885 he joined ‘The men’s and women’s club’, a small body 
devoted to ‘the free and unreserved discussion of all matters in any way connected with
the mutual position and relation of men and women’. As, in The Ethic of Freethought, 
Pearson defended the view that unmarried women should be allowed sexual freedom, it is 
not surprising that legends arose, and still exist, as to this club.* In fact Karl Pearson 
married one of its members, Miss Sharpe. To-day it is quite normal for a couple to discuss 
human sexual physiology before marriage. Seventy years ago it was regarded as grossly 
improper, and all kinds of accusations were made against those who did so. I have not the 
faintest doubt that in fact the male members of the club were far less promiscuous than 
most of their contemporaries. If to-day association with prostitutes is generally regarded 
as degrading, while seventy years ago it was generally condoned and not rarely approved, 
we owe it largely to men like Karl Pearson. 

The Ethic of Freethought was published in 1888, and is a collection of lectures and essays, 
some of which had been reprinted as pamphlets. It is, in essence, a religious book. Pearson 
defined religion as ‘the relation of the finite to the infinite’. ‘Hence’, he continued, ‘all 
systems of religion are of necessity half truths.’ The most scholarly part of the book deals 
with the history of religious systems, particularly in Germany. He believed that such a 
study was part of the duty of an educated man or woman. I read a few sentences. ‘By 
studying the past I do not mean reading a popular historical work, but taking a hundred, 
or better fifty, years in the life of a nation, and studying thoroughly that period. Each one 
of us is capable of such a study, though it may require the leisure moments, not of weeks, 
but of years. It means understanding, not only the politics of that nation during those 
years; not only what its thinkers wrote; not only how the educated classes thought and 
lived; but in addition how the mass of the folk struggled, and what aroused their feeling 
and stirred them to action. In this latter respect more may be learnt from folk-songs and 
broadsheets than from a whole round of foreign campaigns.’ 

The book is largely a record of its author’s search for truth among religious systems. 
One chapter is devoted to the mystic Eckehart, and was the first introduction of that 
remarkable thinker to the British public. Of all the systems examined there can be no 
doubt that that of Spinoza appealed most deeply to Pearson; and he devoted another 
chapter to demonstrating Spinoza’s debt to Maimonides. If I may be allowed to express 
a regret which is in no sense a criticism, it is that Pearson’s acquaintance with Indian 
philosophy was confined to translations of Hinayana Buddhist scriptures. I think that he 
would have recognized more kindred spirits in such ancient Hindu thinkers as Yajnaval- 
kya and the great anonymous humanist whose words are preserved in the first section of 
the Brhadaranyaka Upanishad. 

It is a little surprising that the title page does not mention the author’s professorship at 
University College. Perhaps his senior colleagues thought that such a mention would have 
got him into trouble. 


If, in 1890, one had had to pass judgement on Pearson, it might have run as follows, 


* [As might be expected, the critics tended to fix on only one side of the picture of the ideal
relationship between the sexes in a socialist state which Pearson elaborated in his lectures on “The
Woman’s Question” (1885) and “Socialism and Sex” (1886), afterwards published in The Ethic of
Freethought. Ed.]
‘He is a first-rate teacher of applied mathematics, and a scholarly compiler of the work of
more original men. He has a knowledge of literature and art most unusual in a professor 
of mathematics. He is somewhat of a radical, but he is only thirty-three years old. He will 
settle down as a respectable and useful member of society, and may expect a knighthood 
if he survives to sixty. He will never produce work of great originality, but the College 
need not be sorry to have appointed him.’ Had this judgement been correct, we should not 
be here to-day. 

In 1890 two events occurred which, in my opinion, shaped the course of Pearson’s future 
life. He applied for, and received, the lectureship in Geometry at Gresham College; and 
W. F. R. Weldon succeeded Lankester in the chair of zoology at University College. At 
Gresham College he could lecture on what he pleased. His first set of lectures developed 
into The Grammar of Science, his main contribution to philosophy. Later series dealt with 
‘The Geometry of Statistics’, and ‘The Laws of Chance’. But since the discussion of 
probability and statistical method in the first edition of The Grammar of Science is super- 
ficial, we may take it that in 1891 he had not considered the subject seriously. He certainly 
did so in later years. I have little doubt that the stimulus to do so came largely from 
Weldon. 

The Grammar of Science is a very remarkable book. Pearson claimed that material 
objects were merely a conceptual shorthand used to describe regularities in our sense- 
impressions. This idea is hard to develop, if only because our language is in terms of 
material objects such as eyes and brains. He did not in fact develop it without some self- 
contradiction, at least on the verbal level. But he did so, in my opinion, with much less 
self-contradiction than contemporaries such as Mach and Avenarius. He must be regarded 
as one of the founders of the important school of logical positivism. 

I can well remember the impression which his book made on me when I first read it about 
1909. If it is less impressive to modern readers, this is probably because physical theories 
have changed profoundly, a fact which would in no way have surprised or distressed its 
author. I do not personally think that Pearson’s philosophical views are correct. Never- 
theless, a man who first states an important doctrine clearly, even if it is subsequently 
rejected, is a moment in the thought process of humanity. We can best see whether Pearson 
did this by listening to the judgement of one of his adversaries. In 1908 Vladimir Ilyitch 
Lenin wrote Materialism and Empirio-criticism. This was an attack on people who, in his 
words, or rather those of his translator, ‘under the guise of Marxism were offering some- 
thing incredibly muddled, confused and reactionary’. 

Now Lenin disagreed strongly with Pearson, and claimed, in my opinion correctly, to have 
found self-contradictions in his arguments. Nevertheless, he found him vastly clearer than 
other Machians. Let me read a few of Lenin’s sentences. ‘The philosophy of Pearson, as 
we shall repeatedly find, excels that of Mach in integrity and consistency’ (p. 119).* ‘The 
Englishman, Karl Pearson, expresses himself with characteristic precision, “Man is the 
creator of natural law”.’ (p. 221). And finally (p. 243) Lenin described him as ‘This
conscientious and scrupulous foe of materialism’. Unfortunately, I do not know how precise 
is this translation from the original Russian. But praise of this kind from an opponent is 
in my opinion worth a great deal more than either the assent of uncritical disciples or the 
patronizing acknowledgements of successors who claim to have improved on Pearson’s 


* The references are to the pagination in Vol. 11 of Lenin’s Selected Works. London, Lawrence and 
Wishart, 1939. 
treatment of the subject. The only other contemporary British opponent of materialism to
whom Lenin was equally polite was James Ward. I cannot help thinking of Dante’s 
treatment of Saladin, who was, of course, in hell, but so far from suffering from heat, cold, 
or other torments, was housed in a noble castle. Whatever may be the fate of Pearson’s 
philosophy in his own country, The Grammar of Science is assured of attentive reading in 
those states where Leninism is orthodox. 

To go back to Pearson’s own views, I quote three sentences from the Grammar, which 
I think illustrate the strength and the weakness of Pearson’s approach to science. ‘The 
unity of all science consists alone in its method not in its material.’ ‘No physicist ever saw 
or felt an individual atom. Atom and molecule are intellectual conceptions by aid of which 
physicists classify phenomena and formulate the relationship between their sequences.’ 
The strength is shown by the fact that the distributions, which Pearson worked out to 
describe Weldon’s measurements of populations of crabs, will equally well serve to describe 
populations of stars, manufactured goods, durations of life, incomes, barometer readings, 
and so on. The weakness is shown by the fact that physicists have, during this century, 
seen individual atoms, or rather atomic nuclei, by the tracks which they make when moving 
rapidly. Pearson’s philosophy discouraged him from looking too far behind phenomena. 
It was, I think, for this reason, that he never accepted Mendelian genetics, although the 
Treasury of Human Inheritance and his own monograph on albinism contain plenty of 
evidence in its favour. 

His later series of Gresham Lectures dealt with statistics and probability, particularly 
with graphical methods of representing distributions. I have no doubt that they were 
written partly as a result of the questions which Weldon began putting to him soon after 
his arrival in University College. But his full answer to these questions is to be found in 
the great series of memoirs on the Mathematical Theory of Evolution which were published 
in the Philosophical Transactions of the Royal Society between 1893 and 1900. It is not 
too much to say that the subsequent developments of mathematical statistics are largely 
based on Pearson’s work between 1893 and 1903. Perhaps we shall be helped to estimate 
its importance by an exercise in hypothetics. What would have been the effect on Pearson 
had Bateson obtained the Jodrell Chair of Zoology in place of Weldon? And what would 
have been the effect had our College contained an economist or engineer interested in what 
is now called Quality Control? Although Bateson was as interested as Weldon in animal 
variation, he was more concerned with exceptions, and with discontinuous, or as Pearson 
and Lee called it in 1899, exclusive inheritance. I doubt if Bateson would have put his 
questions in a form which would have aroused Pearson’s interest. If he had done so, they 
would probably have discovered what is now called Mendelism. For Bateson, before 
reading Mendel’s paper, did not realize the necessity of dealing with large samples, which 
Pearson certainly did. 

If an economist or technologist had interested him in the variation of manufactured 
goods, he would have had to deal, as he did, with skew variation. He would presumably 
have used correlation to measure the likeness between the products of the same craftsman 
or machine as he in fact used it to measure the likeness between the children of the same 
parents. Perhaps in 1901 he might have founded a journal Technometrika not wholly unlike
Biometrika. He would almost certainly have invented some, at least, of the statistical
methods now used in industrial quality control. He might perhaps have added 1 % or so 
to the industrial productivity of Britain in the early years of this century. 
The papers to which I refer are hard to read because Pearson reached his conclusions by
algebraical and arithmetical methods which are now seen to be needlessly laborious. Many 
of them have since been simplified. As a humble tribute to Pearson I have, as I believe, 
simplified the first of them, which deals with the dissection of a skew frequency distribution 
into two normal distributions. By an elementary transformation I have thrown his rather 
formidable nonic equation into a form which allows numerical tabulation, and this 
tabulation is now under weigh in the electronic laboratory of the Indian Statistical 
Institute. I hope that as a result, the method will be available to statisticians less pertina- 
cious than Karl Pearson. 

Commenting on a particular passage in (I think) one of Beethoven’s works, a German 
musical critic remarked ‘Hier ist Titanenthum Pflicht’. (Here titanicity is a duty). Karl 
Pearson attacked Olympus by piling Ossa on Pelion rather than by seeking an easy path. 
If we, his successors, have made statistical theory relatively easy, and much of Pearson’s 
mathematics are no longer used, we should remember that we are treading in the footsteps 
of an intellectual titan.

The germs of many later developments in mathematical statistics are to be found in these 
papers. Thus, in Contribution III to the theory of evolution, Pearson discussed ‘the best 
value of the correlation coefficient’ based on a given sample. He decided on the value which 
maximized the chance of obtaining the observed sample. This method was developed by 
Edgeworth as ‘the method of maximum credibility’, and by Fisher as ‘the method of 
maximum likelihood’. In the succeeding paper, with Filon, Pearson developed it further.
Critics have asked why he did not generalize it. I think one possible answer is as follows. 
The expression ‘the best’ is unfortunately seldom applicable to statistical estimates. The 
best for one purpose is not usually the best for another. I think Pearson realized this. 
Some of his successors have not. 
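
The principle of choosing the value which maximizes the chance of obtaining the observed sample may be illustrated by a minimal sketch in Python. It is not Pearson's own algebra. The five pairs of measurements are invented for illustration, the function names are hypothetical, and the means and variances of a bivariate normal distribution are simply fixed at their sample values before the likelihood is scanned over a grid of trial correlations. Under those assumptions the maximizing value should agree with the ordinary product-moment coefficient to within the spacing of the grid.

```python
import math


def sample_correlation(xs, ys):
    """Ordinary product-moment correlation coefficient of paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def log_likelihood(rho, r, n):
    """Bivariate normal log-likelihood as a function of rho, up to an
    additive constant, with means and variances held at sample values.
    (Assumption of this sketch: only the correlation is varied.)"""
    return -0.5 * n * math.log(1 - rho ** 2) - n * (1 - rho * r) / (1 - rho ** 2)


def most_likely_correlation(xs, ys, steps=1999):
    """Scan trial correlations from -0.999 to 0.999 in steps of 0.001."""
    r = sample_correlation(xs, ys)
    n = len(xs)
    grid = [-0.999 + 1.998 * i / (steps - 1) for i in range(steps)]
    return max(grid, key=lambda rho: log_likelihood(rho, r, n))


# Invented measurements, for illustration only.
xs = [64, 66, 68, 70, 72]
ys = [65, 66, 69, 69, 72]
print(sample_correlation(xs, ys), most_likely_correlation(xs, ys))
```

The grid search is deliberately crude. It is meant only to show that ‘the best value’ here means the trial value under which the observed sample is most probable.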

In 1900 Pearson attacked the problem of curve fitting. Having fitted the best available 
curve to a series of data, for example, the numbers of human beings whose heights were in 
intervals such as 70-71 in., he asked what was the probability that a sample from a
population truly represented by his curve should fit it as badly as, or worse than, the sample 
in question. If the chance was 1 in 3, there is no good reason to doubt the validity of the 
theory on which the curve is based. If the chance was 1 in 300, the theory is almost 
certainly wrong, though it may be a useful approximation for some purposes. But the 
question arises ‘what is a bad fit?’ Is 38 a worse fit to an expected number of 30 than 4 to 
an expected number of 10? (It is not!) And how are we to combine these in an overall 
estimate of badness of fit? Pearson solved this problem by the invention of the function of 
observations called χ², which increases as the fit becomes worse.
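
The comparison of the two discrepancies can be checked with a short sketch in Python, using the usual modern form of the function, in which each class contributes the squared difference between observed and expected numbers divided by the expected number. The function names are hypothetical, and the only figures used are the two pairs quoted above.

```python
def chi_square_contribution(observed, expected):
    """Contribution of one class: (observed - expected)^2 / expected."""
    return (observed - expected) ** 2 / expected


def chi_square(observed_counts, expected_counts):
    """Overall badness of fit: the sum of the contributions of all classes."""
    return sum(
        chi_square_contribution(o, e)
        for o, e in zip(observed_counts, expected_counts)
    )


first = chi_square_contribution(38, 30)   # 64 / 30
second = chi_square_contribution(4, 10)   # 36 / 10
print(first < second)                     # 38 against 30 is the better fit
print(chi_square([38, 4], [30, 10]))      # the two combined
```

Dividing by the expected number is what makes a deviation of 8 on 30 count for less than a deviation of 6 on 10, and summing the contributions is what combines them into a single measure.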

This has turned out to be an immensely powerful tool, and is used on a huge scale. To 
take one example, in the last number of the Journal of Genetics, at least fifty-three values 
of χ² were calculated by three different authors. But now comes the curious and character-
istic fact. None of these authors used χ² as a test of curve-fitting, and it is very rarely so
used. It is used as a test of agreement with hypothesis wherever the hypothesis is tested 
by counting individuals. And it is used, as Pearson pointed out that it might be used, to 
discover whether a number of sets of data agree with the same unknown hypothesis. For
example, if the total of a number of families contains about 17 % of a particular type we 
may have had no reason beforehand to expect 17 % rather than any other frequency. But 
we can use χ² to determine whether some of the families have a proportion which diverges
more from 17 % than could reasonably be accounted for by chance. If not, but only if not,
we can justifiably pool the data. In this case χ² is said to be used as a test of homogeneity.
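
A test of homogeneity of this kind may be sketched in Python as follows. The five families are invented for illustration and chosen only so that the pooled frequency comes to 17 %. The function name is hypothetical. The expected numbers in each family are taken from the pooled proportion, since no frequency was specified beforehand, and the degrees of freedom are one less than the number of families.

```python
def homogeneity_chi_square(families):
    """families: list of (number_of_type, family_size) pairs.

    Returns the pooled proportion, chi-square, and degrees of freedom.
    """
    total_type = sum(a for a, n in families)
    total_size = sum(n for a, n in families)
    pooled = total_type / total_size  # estimated from the data themselves

    statistic = 0.0
    for a, n in families:
        expected_type = n * pooled
        expected_other = n * (1 - pooled)
        statistic += (a - expected_type) ** 2 / expected_type
        statistic += ((n - a) - expected_other) ** 2 / expected_other

    degrees_of_freedom = len(families) - 1
    return pooled, statistic, degrees_of_freedom


# Invented families, for illustration only: 17 of the type among 100.
families = [(3, 20), (5, 24), (2, 18), (4, 22), (3, 16)]
pooled, statistic, dof = homogeneity_chi_square(families)
print(pooled, statistic, dof)
```

A value of χ² that is large for its degrees of freedom would argue against pooling. Judging what is large requires a table of χ² or an equivalent routine, which this sketch leaves out.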

This is, of course, a commonplace with great human achievements. The wheel was 
invented for use in chariots, carts, and so on. But to-day most wheels are used, not for the 
support of vehicles, but for power transmission. Perhaps the majority of wheels in England 
are inside watches. It was absolutely characteristic of Karl Pearson that his intellectual 
inventions were often extremely general. He obtained a solution of a problem which was
of such generality that it had entirely unexpected applications. 

But for this very reason it often had a limited applicability to the problem for which it
was originally designed. In the last few years many experiments have been done on 
artificial selection of quantitative characters, particularly in Drosophila and mice. Their 
results during the first few generations are often much as Pearson would have expected. 
But after this they diverge very greatly. In spite of this they are best described by the use 
of the mathematical tools which Pearson first applied to such problems, that is to say by 
describing changes in the moments of character distributions, and simple functions of them 
such as standard deviations and correlations. One can only defeat Pearson intellectually 
with the weapons which he himself forged. If I may be allowed to quote William Blake,*
Pearson’s main service to humanity was 


In all his ancient strength to form the golden armour of science 
For intellectual war. 


About the same time he began not only to use data collected by Galton and others, on 
man and other animals, but to collect his own. Among the important biological results of 
this period were the demonstration that fertility is inherited both in our own species and 
in race-horses. As an example of his thoroughness I mention his measurements of the same 
human bones after various periods of wetting and drying, which never changed their length 
by as much as 1%, though they did change it. 

It was probably through Weldon that he came to know Galton. This very remarkable 
man had, among other things, invented the recognition of criminals by finger prints, and 
psychoanalysis (as may be seen from pages 185-207 of Inquiries into Human Faculty). 
Much of Pearson’s work in the ’90’s was a development of the notions used by Galton in 
his Natural Inheritance in 1889. However, there is no reason to think that Pearson’s one 
serious excursion into practical psychology owed anything to Galton. This is described in 
a pair of papers published in 1899 and 1902 alleged to be on the mathematical theory of 
errors of judgement, but in fact incorporating a series of measurements made on the same 
material by Dr Alice Lee, Dr Udny Yule, and Pearson, and by Lee, Dr Macdonell, and 
Pearson. Each observer had, of course, a characteristic bias and a characteristic spread 
round the mean. But what was utterly unexpected was the discovery that the errors made 
by two observers varied in Pearson’s words, sympathetically. In fact in one series, Lee and 
Macdonell showed a high correlation, Pearson being independent. He attributed this to 
‘the influence of the immediate atmosphere’. Others might have attributed it to telepathy. 

Some of his finest work at this time was with Lee, Bramley-Moore and Beeton on the 
inheritance of human fertility and longevity. I cannot say more for the value of this work 
than that I could find no better data on which to base a theory which I published in 1949, 


* Vala, or The Four Zoas (End of last Night). 
and which I venture to think explains some results which Pearson found surprising at the
time, though, of course, he published them, and did his best to explain them. 

In 1901 the first volume of Biometrika was published, partly no doubt because the Royal 
Society, although it had awarded him its fellowship in 1896 and its Darwin medal in 1898, 
objected to publishing advances in mathematical and biological knowledge in the same 
paper! Biometrika has not fulfilled what Galton, in its first number (p. 9) stated to be the 
primary object of biometry, namely ‘the discovery of incipient changes in evolution which 
are too small to be otherwise apparent’. The reason for this failure is simple. The mean rate 
of increase in tooth length during the evolution of the horse since the Eocene is now known 
to have been about 4 % per million years. Such evolution could not be detected in a human 
lifetime. But the aims stated in the editorial introduction, presumably the joint work of 
Pearson, Weldon and Davenport, were fulfilled. In particular the first number contained 
a paper by Weldon on variation in snails whose importance he did not live to realize. He 
found that natural selection in a snail species weeded out extremes, reducing the standard 
deviation of a metrical character without affecting the mean. We now know that this 
centripetal selection is very common. Had Weldon lived longer he would presumably have 
discovered this, and the whole history of biometry would have been very different. 

In 1903 Pearson’s Department received a grant of £500 from the Drapers’ Company, and 
these grants, at the rate of £500 per year, continued till 1932. In 1903 this sum was worth 
about £3000 or more to-day, and went partly in the payment of Dr Lee and other com- 
puters, partly for instruments, and partly for printing. 

I have no idea how Pearson obtained this money. We may be sure that he did not either 
flatter rich men or promise to improve the national health and intelligence in their lifetimes. 
Perhaps Galton had the ear of some rich acquaintances. Perhaps too, at that time our 
ruling classes were less permeated than now with the ferocious contempt for the pursuit of 
knowledge for its own sake, which is voiced in the Archbishop of Canterbury’s sermon of 
March 24, 1957. To-day it is not hard to get money for research which may have economic, 
military or hygienic advantages. It is extremely hard to do so for the mere search for truth. 

About this time Pearson began the series of papers on human biology for which he is best 
known in some quarters, and the majority of which, I think, were joint work. Even where 
his name did not appear on papers, I think our chairman will agree that nothing was 
published from his laboratory without his imprimatur, and some of such work must at least 
briefly be considered here. 

Many of these papers are as fresh to-day as when they were written. To take an example, 
in my opinion nothing since written on human craniometry has in any way superseded
Pearson’s and Davin’s great memoir of 1924. Some of this work was, at the time, of 
inestimable value. I think particularly of the Treasury of Human Inheritance. This is still 
indispensable. Nevertheless, we now know that it is possible to distinguish between 
conditions (for example, haemophilia, Christmas’ disease, Owren’s disease, and so on) 
which were inevitably classed under a single category by the writers of the Treasury. The 
more polemical writings of this period are of less value to-day, as Pearson doubtless realized
when he called a series ‘Questions of the Day and of the Fray’. The Fray in question was a 
many-sided contest. On the one hand, Pearson and his colleagues attacked those who 
underestimated the importance of heredity, including those who exaggerated the harm done 
by parental alcoholism. But they also attacked those who oversimplified it, including 
many Mendelians and many eugenists. Other attacks were on statistical data alleged to 
prove the value of immunization to diseases. These attacks were fully justified. Mental
defect is certainly not a Mendelian character. As Pearson and Jaederholm showed, the
distribution of intelligence quotients in defectives is the tail of a nearly normal frequency 
distribution. Forty-three years later we can say a great deal more about it. We can say, 
for example, that phenylketonuria is a chemically definable character inherited as a 
Mendelian recessive, and accounting for perhaps 1 % of certifiable mental defect. But the 
mental defect of phenylketonurics is graded, and a few of them are stupid, but not 
sufficiently so to be classed as feeble-minded. In fact the diagnosis of phenylketonuria 
enabled Penrose to dissect the distribution of human intelligence quotients into two very 
different but still overlapping distributions. The notion that such a dissection is possible 
was Pearson’s first contribution to biometry. 

Again, one series of memoirs was entitled Studies in national deterioration. This is a 
polemical title. And it is a fact that as regards most measurable characters the nation has 
not deteriorated. It may have done so as regards its ‘nature’ or inborn capacities. I think 
that if Weldon had lived Pearson would have realized the ubiquity of centripetal selection, 
and that in fact both the most successful and the least successful members of society were 
breeding more slowly than those a little below the median. It is, however, easy to be wise 
after the event. Moreover, Pearson and his colleagues were completely right in one respect. 
Even if, in spite of his predictions, the nation has improved in some measurable directions, 
it would have improved more if, say, a million children who were born to unskilled labourers 
had been born to skilled workers, teachers, and the like. 

No such criticism is possible of the mathematical tables which he edited, and in whose 
computation he played a large part. Their utility was well shown by the fact that ‘pirated’ 
editions of them were soon published in America. The subsequent development of statistics 
is largely based on them. Even the advent of the electronic computer has not yet super- 
seded them. They were published from 1914 to 1934, and the Tables of the Incomplete 
Beta-function, published in Pearson’s seventy-eighth year, were his last, but not his least, 
contribution to science. It appears that no one has yet discovered how to use an electronic 
computer as efficiently as Pearson used his teams of devoted, painstaking, and remarkably 
accurate, lady assistants. 

In 1911 Galton died, and left funds for the endowment of a Chair of Eugenics, of which 
Pearson became the first occupant. At last he was able to give up the teaching of applied 
mathematics to engineers and physicists, and in the next year the present laboratory was 
begun. Fortunately it was completed by 1914 though it was commandeered as an annexe 
to the hospital, and he did not get into it till after the war. From 1914 till 1919 he did very 
little but war work, first for the Board of Trade and later on calculation of trajectories of 
artillery. When in 1920 the Department of Applied Statistics was formally opened, he was 
sixty-three years old, and he had, among other things, to develop a new course of lectures 
and practical work. In 1923 he began a series of papers which combined biometrical and 
historical research. He was able to measure the skulls of a number of distinguished men 
and compare them with contemporary portraits. In the course of this work he played the 
detective, and reconstructed the murder of Lord Darnley, second husband of Queen Mary 
of Scotland; and his comments on the history of the Reformation in Scotland are well 
worth reading. To the same period belongs his great life of Galton, which involved much 
historical research. 

I have devoted this lecture entirely to Pearson’s published works. If he could hear it I 
believe his main criticism would be that I have said too little about his fellow workers; for 
much of his work was in collaboration. He had a wonderful gift for inspiring loyalty in his 
colleagues, of which more will be said to-day by others. I myself only met him frequently 
in the last years of his life, and can merely say that he was most gracious to me, though 
my outlook on many biological questions was very different from his own. He resigned his 
Chair in 1933, but published one book and at least three scientific papers before his death 
in 1936. 

It remains to say a few words about what Samuel Butler would have called his life after 
death, the results which are still accruing from his original thought. To begin with, all 
subsequent statistical work is based on the foundations which he laid. If we sometimes 
find it more convenient to speak of a variance ratio where he would have used a coefficient 
of correlation this does not mean that his work on homotyposis is obsolete. If his system of 
frequency distributions is less used for biological data than he might have hoped, yet 
Gosset, Fisher and others have found that they describe exactly the distributions of many 
statistical estimates based on finite samples. 

At University College his work is being carried on by three professors.* Under his son the 
Department of Statistics has become the leading teaching department in that subject in 
Britain, and the new Biometrika Tables, to take only one example of its work, continue his 
father’s great tradition. Prof. Fisher, who succeeded him as Professor of Eugenics, did very 
great services to statistics in simplifying and rendering more accurate a number of statistical 
procedures ; and, by the application of methods which owed much to Pearson’s great memoir 
on homotyposis, made agricultural experiment an exact art. To mention only one of Fisher's 
contributions to eugenics, he established a laboratory for human serology, and his interpreta- 
tion of the human Rh antigens has had a considerable influence on the prevention and cure 
of a very serious human congenital disease. His controversies with his predecessor were 
perhaps inevitable between two men each so determined to defend the truth as he saw it. 
Under Prof. Penrose the Department has swung back towards Pearsonian methods. If I may 
mention two researches which I believe would have delighted Pearson particularly they are 
the work of Karn and Penrose on infantile mortality as a function of birth weight, which 
measured natural selection in man with an accuracy which he would have envied, and that 
of Penrose on abnormality in the offspring as a function of parental age and maternal 
parity, a subject which Pearson had broached in his 1914 memoir on the handicapping of 
the first-born. 

My own department of biometry has not been so fortunate. I was Professor of Genetics 
till 1937. I should not have accepted the Weldon Chair had I not been promised accommodation 
for Biometrical work. Owing to the war, and for other reasons, this promise was not 
kept. I have been unable to carry out the duties of this chair adequately. In my opinion the 
best biometric work of the last twenty years has been carried out by Teissier and Schreider 
in France, and by Mahalanobis and his colleagues in India. In a few months in India I have 
been able to start new lines of biometrical research. I think particularly of the work of 
S. K. Roy, now I hope in press, in which he took up the problem of homotyposis where 
Pearson left it in 1903. A study of some 60,000 flowers from three different plants (to be 
compared with Pearson’s 4443 capsules from 176 poppy plants) has shown that individual 
plants not only have their characteristic means, but their characteristic standard deviations, 
and that both of these alter in a characteristic manner during a season. I believe that 
the opportunities for Biometric research are now better in India than in Britain, and for 
this reason among others I have thought it my duty to migrate there. To quote Karl 
Pearson’s most loved poet 

In Vishnu-land what avatar? 
Whatever the fate of Pearsonian biometry in Britain, I believe that it will live and flower 
in India. 


* Perhaps I should also include the Professors of Astronomy and Art History, and those of 
Engineering. 

To me at least there seems to be an element of hypocrisy about the present celebrations. 
I believe that we should be honouring Pearson more effectively if, to take one example out 
of many possible, we ensured that the College Library possessed copies of all his works, 
placed where students could consult them, than by making speeches and eating food. 
I mention this particular example as I have been trying, without the faintest success, to 
secure such accessibility for at least ten years. 

Pearson’s work for free thought and the emancipation of women has been successful in 
this country, if not always as quickly so as he hoped seventy years ago. The history of art 
is now taught in this College. His work for socialism has not been as successful here as he 
hoped. Nor would he have approved of many features of the socialistic systems of the 
Soviet Union and China. Here again I believe his real heirs are to be found in India, where 
the editor of Sankhya, the Indian Journal of Statistics, is also the principal planner of the 
approach to socialism under the second Five-Year Plan. 

I fully realize that I have not done justice to my subject. The task set me was an 
impossible one. No one man now alive could do justice to the breadth of Karl Pearson’s 
interests and achievements. But I thank you for joining with me in celebrating the 
memory of this great man. 


AN ANALYSIS OF THE DATA FOR SOME EXPERIMENTS 
CARRIED OUT BY GAUSE WITH POPULATIONS OF THE 
PROTOZOA, PARAMECIUM AURELIA AND 
PARAMECIUM CAUDATUM 


By P. H. LESLIE 
Bureau of Animal Population, Department of Zoological Field Studies, Oxford 


1. INTRODUCTION 


In his book, The Struggle for Existence, Gause (1934, pp. 96 et seq.) describes a very interest- 
ing set of experiments which he carried out with populations of the Protozoa Paramecium 
aurelia and P. caudatum. The growth in numbers of these populations was observed when 
each of the species was living alone, and also when both were living together in a constant 
volume of nutrient medium. This consisted of 5.0 c.c. of Osterhout’s balanced physiological 
salt solution in which was suspended a standard measured amount of a freshly grown 
culture of Bacillus pyocyaneus, the latter serving as a food supply for the Protozoa. This 
medium was freshly prepared each day, and it is stated that in this solution no multiplication 
of the suspended bacteria takes place. Each set of experiments was replicated either three 
or four times, and the cultures were kept in a moist thermostat at 26°C. At the end of every 
24 hr. a sample was taken from each culture in order to count the number of individuals 
present; and after centrifuging down the remaining individuals, the old medium was 
removed and replaced with the same volume of a freshly prepared suspension of bacteria. 

This series of data is therefore of some interest from the point of view of population 
mathematics. For essentially, what we are given here are the changes in the numbers of 
two species, either living alone or competing together in an environment of fixed size, and 
under as constant a set of physical conditions and amount of food as the experimental 
procedure admitted. The two species observed are of a comparatively simple biological 
type and, under these experimental conditions, can be regarded as multiplying merely by 
division of the individuals. This greatly simplifies any mathematical model we adopt, 
since the complications due to age-specific birth and death rates and the age structure of 
the population can be neglected. Moreover, since each set of experiments was replicated, 
we are able to estimate, in addition to the mean values of these stochastic processes, the 
degree of variance between replicates at some time, or over some particular period in the 
development of the populations. In the case of these data, however, any estimate of this 
inter-replicate variance will have at least two components. There is, in the first place, the 
variance which is to be expected between a set of comparable stochastic processes at some 
given time; and, in addition, there is the composite variance due to the sampling of each 
population as a whole, and to the method of counting the individuals in the samples. 

The data for the individual replicates of P. aurelia and P. caudatum, when each was living 
alone, and of the mixed cultures, are given by Gause in the appendix to his book (1934, 
Table 3, pp. 144-5). (It should be noted that each entry in this table must be multiplied 
by a factor of ten in order to find the estimated total numbers in each culture as a whole.) 
Each population of a single species was started with 20 individuals (or 40 in the case of the 
mixed cultures) and the total numbers thereafter increased until there might be around 
5000-6000 individuals in the case of some cultures. There was, however, a considerable 
degree of variation between the replicates at each of the successive sample censuses. 

Evidently we are dealing here with a set of experimental populations in which the numbers 
are not particularly small. We might assume, therefore, as a first approximation that the 
changes in the mean values of these stochastic processes could be described adequately in 
terms of a simple deterministic model; and the object of the present paper is to examine 
these data in the light of this hypothesis. 


2. MATHEMATICAL MODEL 


The mathematical model which will be used in the analysis of these data is the familiar one 
associated with the names of Lotka (1925, 1932) and Volterra (1926, 1931). Suppose we 
have two competing species of living organisms, $S_1$ and $S_2$, whose populations consist 
respectively of $N_1$ and $N_2$ individuals at any given moment. If either of these species were 
living alone, under some constant set of physical conditions, such as the temperature, 
relative humidity, the hydrogen ion concentration of the medium, and so forth and no 
limitations whatsoever were placed upon its increase in numbers; and if the amount of 
some particular source of food supply is unlimited; then we assume that the species 
would increase in numbers at a rate defined by 


$$\frac{dN}{dt} = rN = (b - d)N,$$

where r is the intrinsic rate of increase, representing the difference between the ‘birth-rate’, 
b, and ‘death-rate’, d, of the species under the given conditions. In a finite environment, 
however, when some source of food supply is assumed to remain constant in amount, an 
approach to this intrinsic rate of increase would only be made when the number of in- 
dividuals is relatively small, and we suppose that any increase in numbers will tend to have 
an adverse effect on the relative rate of increase of the population. Hence, when limitations 
either of space, and/or food supply, are placed upon the increase in numbers, we have in 
general for a particular species living alone 





$$\frac{dN}{N\,dt} = F(N),$$

where the function $F$ is subject to the condition, 

$$\frac{dF}{dN} < 0.$$

As a first approximation, the simplest function which fulfils these conditions is evidently 

$$F(N) = r - aN,$$

where $a$ is a positive parameter; so that we have the well-known logistic differential equation 

$$\frac{dN}{dt} = (r - aN)N,$$

whence, by integration, 

$$N_t = \frac{K}{1 + Ce^{-rt}},$$

where $K = r/a$, and $C$ is a constant defining the initial state of the system. 










When the two species $S_1$ and $S_2$ are competing together in a limited environment for some 
common food supply remaining constant in amount, we suppose that any increase in 
numbers of the species $S_1$ will tend to have an adverse effect, not only on its own relative 
rate of increase in numbers, but also on that of the species $S_2$; and, similarly, any increase 
in numbers of $S_2$ will have an adverse effect both on its own relative rate of increase and 
on that of the species $S_1$. We have, therefore, the perfectly general equations describing 
the interaction between two competing species, under some constant set of physical and 
environmental conditions, 





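$$\frac{dN_1}{N_1\,dt} = F_1(N_1, N_2), \qquad \frac{dN_2}{N_2\,dt} = F_2(N_1, N_2),$$

where the functions $F_1$ and $F_2$ are subject to the conditions, 

$$\frac{\partial F_1}{\partial N_1} < 0, \quad \frac{\partial F_1}{\partial N_2} < 0, \quad \frac{\partial F_2}{\partial N_1} < 0, \quad \frac{\partial F_2}{\partial N_2} < 0.$$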


For our present purpose it is not necessary to discuss the properties of this general system 
of equations. Clearly, by an extension of the argument used above for the case of a species 
living alone, the simplest forms of the functions $F_1$ and $F_2$ which fulfil these conditions are 
a set linear in the variables $N_1$ and $N_2$. We write, therefore, 

$$\frac{dN_1}{dt} = (r_1 - a_1 N_1 - b_1 N_2)N_1, \qquad \frac{dN_2}{dt} = (r_2 - a_2 N_2 - b_2 N_1)N_2,$$

where $r_1$ and $a_1$ are the logistic parameters for the species $S_1$, if it were living alone, and 
similarly $r_2$ and $a_2$ those for the species $S_2$; while the positive parameters $b_1$ and $b_2$ measure 
the degree to which each species affects the relative growth rate of the other. In general, 
all these six parameters are unequal, in which case there is no explicit form of solution of 
this set of differential equations; but, when they are not all unequal, various special cases 
may arise, one of which appears to be applicable to the present series of data. 


3. APPLICATION OF THE MODELS TO THE PRESENT DATA 


In applying this model to the results of Gause’s experiments, there is, however, one im- 
portant point to be borne in mind. Hitherto, in developing this model, we have supposed 
that the species $S_1$ and $S_2$ were living, either alone or together, in some ideally isolated 
system, abstracted from the rest of nature, and that the activities of the observer had no 
effect on the growth in numbers of the populations. But, in these experiments, every day 
before the nutrient medium was changed, each culture was carefully stirred and a sample of 
0.5 c.c. (one-tenth of the total volume) was removed, and the number of individuals in it 
was counted. Having been counted, this sample was destroyed (Gause, 1934, p. 97). Thus, 
the activities of the observer, in this case, were equivalent to those of an intermittent 
predator on the populations of Paramecia, resulting in the sudden destruction at the end 
of every 24 hr. of, on the average, one-tenth of the total number of individuals present 
at that time. 

If we define $n_t$ as the number of individuals counted in the sample of 0.5 c.c., then the 
estimated total number of individuals in each culture, at the time it was removed from the 
incubator, is 

$$N_t = 10 n_t.$$

After renewal of the medium, each culture when it was returned to the incubator therefore 
on the average contained, at the beginning of the interval $t$ to $t+1$, 

$$N'_t = 0.9 N_t$$


individuals. We assume, in the case of each species when living alone, that during this interval 
$t$ to $t+1$, the population was growing in numbers according to the logistic differential 
equation 

$$\frac{dN}{dt} = (r - aN)N, \tag{3-1}$$


and we require estimates of the parameters $r$ and $a$, given the values of $n_t$ at successive 
intervals of time. 

From (3-1) we have by integration 

$$N_t = \frac{K}{1 + Ce^{-rt}} \qquad (K = r/a); \tag{3-2}$$

whence 

$$(K - N_t)/N_t = Ce^{-rt}; \tag{3-3}$$

and at time $t+1$ 

$$N_{t+1} = \frac{K}{1 + Ce^{-r(t+1)}}.$$


Substituting (3-3) in this last equation, and writing 

$$\lambda = e^{r},$$

we have, after a little rearrangement, the following equation relating the number of individuals 
at the end of the interval $t$ to $t+1$ with the number at the beginning, 

$$N_{t+1} = \frac{\lambda N_t}{1 + \alpha N_t},$$

where $\alpha = (\lambda - 1)/K$. 


Since we have defined the number of individuals at the beginning of each interval as 
$N'_t = 0.9 N_t$, owing to the destruction of a tenth of the population in the sample discarded 
after counting, we have for the successive intervals, 

$$N_{t+1} = \frac{\lambda N'_t}{1 + \alpha N'_t}, \tag{3-4}$$

where $N'_0$ is the number of individuals with which the cultures were started. From (3-4) 
we have 

$$\frac{N'_t}{N_{t+1}} = \frac{1}{\lambda} + \frac{\alpha}{\lambda} N'_t, \tag{3-5}$$

a linear relationship from which $\lambda$ and $\alpha$ can be determined, and hence the parameters 
$r$ and $K = r/a$ of the logistic equation. 
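
The step from the continuous logistic (3-2) to the recurrence (3-4) and the straight line (3-5) can be checked numerically. What follows is a minimal Python sketch and not part of the original analysis; the function names and the values r = 0.9 and K = 5000 are assumptions chosen purely for illustration. It confirms that one application of the recurrence agrees with the integrated logistic over unit time, and that error-free points lie exactly on the line (3-5), so that its intercept and slope return r and K.

```python
import math

def logistic(t, r, K, n0):
    """Continuous logistic (3-2) passing through N = n0 at t = 0."""
    c = (K - n0) / n0
    return K / (1.0 + c * math.exp(-r * t))

def one_day_growth(n_start, lam, alpha):
    """Recurrence (3-4): numbers at the end of a day, given the numbers at its start."""
    return lam * n_start / (1.0 + alpha * n_start)

def parameters_from_line(intercept, slope):
    """Invert (3-5), where intercept = 1/lambda and slope = alpha/lambda."""
    lam = 1.0 / intercept
    alpha = slope * lam
    r = math.log(lam)
    K = (lam - 1.0) / alpha
    return lam, alpha, r, K

# Hypothetical parameter values for illustration only (not estimates from Gause's data).
r_true, K_true = 0.9, 5000.0
lam_true = math.exp(r_true)
alpha_true = (lam_true - 1.0) / K_true

# Daily growth followed by removal of one-tenth of the culture.
x, y = [], []                      # x = N'_t, y = N'_t / N_{t+1}
n_prime = 20.0
for _day in range(10):
    n_next = one_day_growth(n_prime, lam_true, alpha_true)
    x.append(n_prime)
    y.append(n_prime / n_next)
    n_prime = 0.9 * n_next

# With error-free data any two points determine the line.
slope = (y[-1] - y[0]) / (x[-1] - x[0])
intercept = y[0] - slope * x[0]
lam_hat, alpha_hat, r_hat, K_hat = parameters_from_line(intercept, slope)
```

With real counts both variables carry error, so two points would not suffice and the choice of fitting method matters.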
It will be noted that (3-4) can also be written as 

$$N_{t+1} = \frac{\lambda' N_t}{1 + \alpha' N_t},$$

where $\lambda' = 0.9\lambda$ and $\alpha' = 0.9\alpha$; and $N_t = N'_t/0.9$. Thus, since $N_t = 10 n_t$, if we work in terms 
of the numbers recorded by Gause, each multiplied by a factor of ten (although this is not 
really necessary), the estimated total numbers of individuals in the population, allowing 
for the activities of the observer, also follow a logistic curve defined by 

$$N_t = \frac{K'}{1 + C'e^{-r't}} \qquad (K' = r'/a'), \tag{3-6}$$

provided we adjust the initial numbers by dividing them by a factor of 0.9. The relationship 
between $K'$, $r'$ and the $K$, $r$ of (3-2) is given by 

$$r' = r + \log_e(0.9), \qquad K' = \frac{(0.9e^{r} - 1)K}{0.9(e^{r} - 1)}.$$

Thus, in the case of these experiments, there are, when each species of Paramecium is 
living alone, two logistic equations to be considered. There is, in the first place, equation 
(3-2) which in terms of the model describes the expected average change in numbers of the 
population when no activities of the observer affect it in any way; and secondly, we have 
equation (3-6) describing the interaction between the population and the observer, in so 
far as the latter is acting as a predator on the numbers of the former. We may work in terms 
of either of these equations, transforming the estimated values of the parameters from the 
one into the other whenever necessary. 
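
The transformation between the two logistic curves can be illustrated with a short Python sketch. It is offered only as an illustration under stated assumptions: the function names are invented here, the values r = 0.9 and K = 5000 are hypothetical, and a sample is assumed to be removed on every day (whereas in the experiments no sample was taken on the first day). The sketch converts (r, K) to (r', K') and back, and compares the daily census totals generated by the sampled recurrence with the adjusted logistic curve started from the initial number divided by 0.9.

```python
import math

def observer_adjusted(r, K, keep=0.9):
    """Parameters (r', K') of the logistic that allows for the daily sample."""
    lam = math.exp(r)
    r_adj = r + math.log(keep)
    K_adj = (keep * lam - 1.0) * K / (keep * (lam - 1.0))
    return r_adj, K_adj

def undisturbed(r_adj, K_adj, keep=0.9):
    """Inverse transformation, back to the parameters of the undisturbed logistic."""
    r = r_adj - math.log(keep)
    lam = math.exp(r)
    K = K_adj * keep * (lam - 1.0) / (keep * lam - 1.0)
    return r, K

def census_totals(r, K, n_start, days, keep=0.9):
    """Totals N_1, N_2, ... at each census when a sample is removed every day."""
    lam = math.exp(r)
    alpha = (lam - 1.0) / K
    totals = []
    n_prime = n_start
    for _day in range(days):
        n_total = lam * n_prime / (1.0 + alpha * n_prime)
        totals.append(n_total)
        n_prime = keep * n_total
    return totals

def adjusted_logistic(t, r_adj, K_adj, n_start, keep=0.9):
    """Adjusted logistic curve with the initial number taken as n_start / keep."""
    n0 = n_start / keep
    c = (K_adj - n0) / n0
    return K_adj / (1.0 + c * math.exp(-r_adj * t))

# Hypothetical values for illustration only.
r, K = 0.9, 5000.0
r_adj, K_adj = observer_adjusted(r, K)
totals = census_totals(r, K, n_start=20.0, days=40)
```

The census totals settle at K' rather than K, which is the sense in which the observer acts as a predator on the population.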


4. EXPERIMENTS WITH EACH SPECIES OF PARAMECIUM LIVING ALONE 


The data for these experiments are given by Gause (1934, Table 3, pp. 144-5), in the form 
of the number of individuals per 0.5 c.c. of culture. Each entry in this table thus represents, 
presumably, the result of counting the number of individuals in the sample of 0.5 c.c. 
removed each day from a total volume of 5.0 c.c. It is, however, not at all clear whether the 
entire sample was always counted, since the occurrence of either zero or five in the last 
figure of the more dense populations suggests that some form of subsampling was employed 
in doing the actual counts, more particularly when the populations were approaching their 
upper asymptote in numbers. It appears from this table that the first sample census was 
taken when each replicate was 2 days old, and that thereafter each surviving replicate was 
sampled daily until the nineteenth day after the start of the experiments. From day 20 
until day 25, when the series ends, there are no records for the individual replicates, but 
it is stated in the text (p. 99): ‘A separate counting of the number of individuals in every 
culture was discontinued from the twentieth day, and we began to take average samples 
from similar cultures.’ Working from an origin of time, therefore, we have the following 
information for making any estimates of the variance between replicates. 


Type of culture        No. of replicates    Days 
P. aurelia alone               3            2-19 
P. caudatum alone              4            2-13 
                               3            14-19 


The values of the parameters in the logistic equations for the two species were estimated 
from the mean values ($\bar{n}_t$) of each set of replicates, as given by Gause in his table, by means 
of the equation (3-5), viz. 

$$\frac{N'_t}{N_{t+1}} = \frac{1}{\lambda} + \frac{\alpha}{\lambda} N'_t,$$

where $N_t = 10\bar{n}_t$ and $N'_t = 9\bar{n}_t$ for $t = 2, 3, 4, \ldots, 25$. No census apparently was taken at 
$t = 1$; but initially $N'_0 = 20$ in both cases, and in order to retain this information in the 
calculations, the ratio $\sqrt{20/N_2}$ was taken as an estimate of $N'_0/N_1$. This is quite a close 
approximation when $N_1$ and $N_2$ are small compared with the upper asymptote in numbers. Writing 

$$N'_t/N_{t+1} = y, \qquad N'_t = x,$$

we have the linear relationship between these variables 

$$(y - \bar{y}) = b(x - \bar{x}).$$
There is here a question as to the best method of estimating the parameter $b = \alpha/\lambda$. We 
are not regarding, in the usual way, y as a dependent and x as an independent variable, for 
which we require an estimate of the regression coefficient b of y on x. Moreover, both these 
variables are subject to error. Actually, in the present instances, a method suggested by 


Rhodes (1940) for estimating the parameters of a logistic by means of a very similar type 
of linear relationship was adopted, and the value of the parameter b was taken to be 


using the entire data recorded by Gause from day 0 until day 25. As a result the following 
estimates of the parameters $\lambda$ and $\alpha$ were obtained. 


                  $\lambda$      $\alpha$ 
P. aurelia        2.4905         0.00026195 
P. caudatum       2.2042         0.00058189 







These numerical values of the parameters $\lambda$ and $\alpha$ were then inserted in equation (3-4), 
and starting at $t = 0$ with $N'_0 = 20$ individuals in each case, the expected total number of 
individuals at time $t$, and hence of the number in the sample removed and discarded at 
that time, were calculated by means of a repeated application of the equation. (Since no 
sample was taken at $t = 1$, we have $N'_1 = N_1$ in both cases.) The expected mean number of 
individuals ($\bar{n}_t$) in the sample of 0.5 c.c. are given in Table 1, together with the observed 
mean numbers from $t = 2$ to $t = 19$. The calculations were stopped at this point because this 
was the last day for which the data for the individual replicates were recorded, and also 
because it was evident that the upper asymptote in numbers for the theoretical curves was 
attained by this time. 
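
The repeated application of equation (3-4) described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the author's own computation: the function and argument names are invented here, and the parameter estimates are taken as 2.4905 and 0.00026195 for P. aurelia and 2.2042 and 0.00058189 for P. caudatum, the decimal points being an assumed reading of the tabulated estimates (a reading that agrees with the values of r quoted in section 5). Each culture starts with 20 individuals, no sample is removed on day 1, and from day 2 onwards one-tenth of the culture is counted and discarded.

```python
def expected_sample_means(lam, alpha, n_start=20.0, last_day=19,
                          first_census_day=2, sample_fraction=0.1):
    """Repeated application of (3-4); returns {day: expected count in the 0.5 c.c. sample}."""
    expected = {}
    n_prime = n_start                                      # N'_0
    for day in range(1, last_day + 1):
        n_total = lam * n_prime / (1.0 + alpha * n_prime)  # N_t at the end of the day
        if day >= first_census_day:
            expected[day] = sample_fraction * n_total      # count in the sample, N_t / 10
            n_prime = (1.0 - sample_fraction) * n_total    # N'_t = 0.9 N_t after sampling
        else:
            n_prime = n_total                              # no sample removed on day 1
    return expected

# Parameter values as read from the tabulated estimates (assumed decimal placement).
aurelia = expected_sample_means(lam=2.4905, alpha=0.00026195)
caudatum = expected_sample_means(lam=2.2042, alpha=0.00058189)

for day in (2, 4, 10, 19):
    print(day, round(aurelia[day], 1), round(caudatum[day], 1))
```

Because the parameters are quoted to a limited number of figures, values computed this way may differ from the printed table in the last decimal place.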

A test of the ‘goodness-of-fit’ of these calculated curves can be made by comparing the 
pooled inter-replicate variance between days 2 and 19 with the mean square deviation 
between expected and observed in Table 1. There are, in each case, eighteen observations 
from which the latter can be calculated, and the inter-replicate variances can be based, 
respectively, on 36 and 48 degrees of freedom. If we were to apply the usual variance-ratio 
test of significance, then taking as the number of degrees of freedom, $n_1 = 18$ and $n_2 = 36$ 
or 48, the 5 % points of $F$ would be approximately 1.89 and 1.81. But, it is not at all clear 
whether we would be justified in applying this test to the present data. Although the suc- 
cessive daily samples withdrawn from each population are independent, the number of 
individuals in a particular population observed over a period of time are likely to be serially 
correlated. A replicate, for instance, which is greater than the mean at some given time, is 
likely also to be greater than the mean on the next day. Leaving aside this point, however, 
a rough test of the goodness-of-fit in the present examples would be to regard as unsatis- 
factory any ratio between these variances which was greater than two. 

Because the magnitude of the figures recorded for the individual replicates differed quite 
considerably, not only between themselves at some given time, but also as the populations 
increased in numbers, the calculations were carried out in terms of the logarithms (to the 
base 10) of the numbers observed. Estimating the pooled inter-replicate variance in the 
usual way, by eliminating the variance between days, the following values of $s^2$ were 
obtained for the period of time 2-19:* 


                  D.F.      $s^2$ 
P. aurelia         36       0.005825 
P. caudatum        48       0.01657 








Thus, since in Table 1 the mean numbers of P. aurelia are each based on three replicates, 
we should expect the mean variance between $\log_{10}$ expected and $\log_{10}$ observed to be around 
$s^2 = 0.005825/3 = 0.00194$, if the calculated curve is a good fit to that observed. In the 
case of P. caudatum, the first twelve observations are based on four replicates, and the 
remainder on three replicates, and thus we should expect a mean variance of around 
$s^2 = 0.00460$. Then, taking the figures in Table 1, and defining the deviation, 

$$\Delta = \log_{10} \text{observed} - \log_{10} \text{expected},$$

we have the following mean square deviations, $\Sigma\Delta^2/18$, contrasted with those expected 
from the inter-replicate variances. 


                  $\Sigma\Delta^2/18$     Expected $s^2$ 
P. aurelia           0.00246                 0.00194 
P. caudatum          0.02738                 0.00460 


* An analysis was also made of the eighteen daily inter-replicate variances for both species. Applying 
Bartlett’s test for the homogeneity of variances, the corrected $\chi^2$ was for P. aurelia, 15.70 and for 
P. caudatum, 21.08, there being 17 degrees of freedom in each case. Assuming that this test is applicable 
to the present data, these would be regarded as perfectly satisfactory values of $\chi^2$, and we would conclude 
that there was no evidence of any heterogeneity in the daily variances. This test, however, does not take 
into account the serial order of the observations, and there was a suggestion that the larger estimates of 
$s^2$ tended to occur in the earlier part of both series. When the individual variances were ranked in order 
of magnitude from large to small, and contrasted with the time order, 1, 2, 3,..., 18, in which they were 
observed, the rank correlation (tau) was for P. aurelia, $\tau = +0.42$, and for P. caudatum, $\tau = +0.61$. 
Both of these rank correlations are significant (P = 0.02-0.01, and P < 0.001, respectively). Thus, 
there is strong evidence that working in terms of $\log_{10} n_t$, the variance between replicates is some function 
of time. It might be held, therefore, that the use of the pooled inter-replicate variance, as a measure of 
the ‘goodness-of-fit’ between expected and observed, could be misleading. However, as we shall only 
compare this pooled variance with the mean square deviation between expected and observed over the 
same time period, this objection is perhaps not very serious. Any agreement between the two figures 
would suggest that, if there were any bias in the fitting, the degree of this bias was not very great. 





Table 1. The observed and expected mean number of individuals ($\bar{n}_t$) in 0.5 c.c. of 
culture for populations of Paramecium aurelia and P. caudatum living alone 


 Time           P. aurelia                 P. caudatum 
(days)     Observed     Expected      Observed     Expected 
   2      [illegible]     12.2           10           9.4 
   3      [illegible]     26.6           10          17.7 
   4          56          56.0           11          32.1 
   5          94         110.9           21          54.5 
   6         189         197.1           56          84.1 
   7         266         301.6          104         115.8 
   8         330         395.1          137         143.0 
   9         416         458.5          165         162.2 
  10         507         493.9          194         174.0 
  11         580         511.5          217         180.6 
  12         610         519.7          199         184.1 
  13         513         523.5          201         186.0 
  14         593         525.2          182         186.9 
  15         557         526.0          192         187.4 
  16         560         526.3          179         187.6 
  17         522         526.4          190         187.7 
  18         565         526.5          206         187.8 
  19         517         526.6          209         187.8 








Clearly, in the case of P. aurelia, the mean square deviation is of much the same order as 
that expected; while for P. caudatum it is markedly excessive, being very nearly six times 
greater than $s^2$. By inspection, it is clear from Table 1 that the major portion of this discrepancy 
is due to a marked divergence between ‘expected’ and ‘observed’ for days 4 and 
5. For day 4 the expected numbers are nearly three times greater than those observed, and 
for day 5, 2.6 times. Obviously, as may be seen by graphing the figures in Table 1, no simple 
function such as the logistic could be expected to fit the recorded mean numbers of caudatum 
over the entire development of these replicates. However, apart from the entries for days 
4 and 5, it appears that the fit for the remaining entries is reasonable, for we have for the 
remaining 16 days $\Sigma\Delta^2/16 = 0.00658$, and the expected $s^2 = 0.00404$. 
Thus, excluding the results for days 4 and 5 in the case of P. caudatum, and judging by 
the criterion adopted, the overall discrepancy between the calculated and observed curves 
is in both cases much what one might expect from the degree of variation between replicates. 
There is, however, one further point which should be noted. If we examine the signs of the 
differences, ‘observed’ — ‘expected’, in Table 1, there is a suggestion that in both cases 
negative signs tend to occur more frequently in the earlier part of the growth curves, and 
positive signs more frequently in the later stages. In other words, these logistic curves may 
tend to overestimate the population in the earlier stages of the growth in numbers, and 
underestimate it in the later stages. The degree of this bias is, however, slight, and con- 
sidering that the logistic is being applied merely as a first approximation, these curves are 
probably as good a fit to the observed data as might be expected from the use of such a 
simple model. 


5. GROWTH IN MIXED CULTURES 


When both species of Paramecium are living together in the same microcosm, we consider
the pair of differential equations

$$\frac{dN_1}{dt} = (r_1 - a_1 N_1 - b_1 N_2)\,N_1, \qquad
\frac{dN_2}{dt} = (r_2 - a_2 N_2 - b_2 N_1)\,N_2, \tag{5.1}$$

in which $r_1$ and $a_1$ are the logistic parameters for the species $S_1$ when it is living alone, and
similarly $r_2$ and $a_2$ those for the species $S_2$. The positive parameters $b_1$ and $b_2$ are a measure
of the effect which each species has on the relative growth rate of the other.

Now, in the very simplified experimental conditions in which these two closely related 
types of Paramecium were living, we might assume that the magnitude of the effect which 
P. aurelia had on the relative rate of increase of P. caudatum was much the same as that 
which it had on its own relative rate of increase. And similarly for the effect of P. caudatum on 
P. aurelia in the mixed cultures. In other words, we might assume that we are dealing with 
a special case of (5-1) in which we have, 

$$b_1 = a_2, \qquad b_2 = a_1.$$

Then, by subtraction,

$$\frac{dN_1}{N_1\,dt} - \frac{dN_2}{N_2\,dt} = r_1 - r_2,$$

and integrating,*

$$\frac{N_1(t)}{N_2(t)} = \frac{N_1(0)}{N_2(0)}\,e^{(r_1 - r_2)t}. \tag{5.2}$$

Thus, if our assumptions are correct, two consequences should follow.


(1) If we define $S_1$ as the species having the greater intrinsic rate of increase ($r_1 > r_2$),
then as $t \to \infty$, the numbers $N_2$ of the species $S_2 \to 0$, and ultimately the species $S_1$ will
persist alone.


(2) If we take the natural logarithms of the ratios $N_1(t)/N_2(t)$ and plot these against time
as the independent variable, then the successive points should fall on a straight line, and
the slope of this line should be the same as the difference between the intrinsic rates of
increase of the two species when each was living alone.
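
Consequence (2) can be checked numerically. The following is a minimal sketch, not part of the original analysis: it integrates the special case of the competition equations by a classical fourth-order Runge-Kutta scheme and records the natural logarithm of the ratio of the two populations at unit intervals. The function name, the step size and the crowding coefficients `A1`, `A2` (taken here as r/K with illustrative carrying capacities of 5690 and 2070) are assumptions made only for the illustration; the intrinsic rates are the values 0.9125 and 0.7904 quoted in the text.

```python
import math

def log_ratio_over_time(r1, r2, a1, a2, n1, n2, days, steps_per_day=200):
    """Integrate dN1/dt = (r1 - a1*N1 - a2*N2)*N1 and
    dN2/dt = (r2 - a2*N2 - a1*N1)*N2 by classical RK4.

    Returns log_e(N1/N2) at t = 0, 1, ..., days.
    """
    def deriv(x, y):
        crowding = a1 * x + a2 * y  # identical term in both equations
        return (r1 - crowding) * x, (r2 - crowding) * y

    h = 1.0 / steps_per_day
    out = [math.log(n1 / n2)]
    for _ in range(days):
        for _ in range(steps_per_day):
            k1x, k1y = deriv(n1, n2)
            k2x, k2y = deriv(n1 + 0.5 * h * k1x, n2 + 0.5 * h * k1y)
            k3x, k3y = deriv(n1 + 0.5 * h * k2x, n2 + 0.5 * h * k2y)
            k4x, k4y = deriv(n1 + h * k3x, n2 + h * k3y)
            n1 += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
            n2 += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
        out.append(math.log(n1 / n2))
    return out

R1, R2 = 0.9125, 0.7904              # intrinsic rates quoted in the text
A1, A2 = R1 / 5690.0, R2 / 2070.0    # illustrative crowding coefficients (assumed)

ys = log_ratio_over_time(R1, R2, A1, A2, 20.0, 20.0, days=25)
slopes = [ys[t + 1] - ys[t] for t in range(25)]  # daily increments of log_e(N1/N2)
```

Under these assumptions every daily increment in `slopes` should agree with `R1 - R2` to within the integration error, whatever values are chosen for `A1` and `A2`, which is the straight-line property stated in (2).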


* For footnote see opposite page. 
In the previous section we have estimated that the intrinsic rate of increase ($r = \log_e \lambda$)
of P. aurelia, when living alone, was $r_1 = 0.9125$; while for P. caudatum, $r_2 = 0.7904$; the
difference between these estimates being $r_1 - r_2 = 0.1221$. We therefore define P. aurelia as
the species $S_1$ and P. caudatum as the species $S_2$, and we should expect that the latter
would tend gradually to disappear from the mixed cultures. It is apparent from the data
given by Gause (Table 3, p. 144) that this was the case in these experiments. In order to
see whether the second consequence of our assumption is fulfilled, we must estimate the
difference $(r_1 - r_2)$ from the data for the mixed cultures. Since the same proportion of the
total population of each species was removed and discarded at each sample census, we may
work in terms of the mean number of individuals ($\bar{n}_a$) per 0.5 c.c. of culture given by Gause.
Putting

$$y_t = \log_e \bar{n}_1(t) - \log_e \bar{n}_2(t),$$

it will be found that the relation between the successive $y_t$ and $t$ is approximately linear,
and fitting a straight line

$$y = kt \qquad (k = r_1 - r_2)$$

passing through the origin (since $\bar{n}_1(0)/\bar{n}_2(0) = 1$ is given), the regression coefficient $k$ was
estimated from the 24 points available from day 2 to day 25 as

$$k = 0.1132 \pm 0.0034.$$

This value is remarkably close to the difference $r_1 - r_2 = 0.1221$ which was estimated from
the data for each species when living alone.


* This special case of the differential equations (5.1), when $b_1 = a_2$ and $b_2 = a_1$, was discussed by
Volterra (1926) in his original memoir. If we substitute (5.2) for the terms in $N_2$ and $N_1$ in the first and
second members, respectively, of (5.1), then the resulting equations can be integrated, and $N_1$ and $N_2$
can be expressed as functions of time. Thus, if we put

$$N_1(t) = x, \quad K_1 = r_1/a_1; \qquad N_2(t) = y, \quad K_2 = r_2/a_2; \qquad k = r_1 - r_2,$$

these integrals can be written as

$$x^{-1}(t) = K_1^{-1} + K_2^{-1} C^{-1} e^{-kt} + C_1 e^{-r_1 t},$$
$$y^{-1}(t) = K_2^{-1} + K_1^{-1} C e^{kt} + C_2 e^{-r_2 t},$$

where $C$, $C_1$ and $C_2$ are constants defining the initial state of the system.

These equations, however, are of little use in the present problem, since we also have to allow for the
activities of the observer. In order to apply them to Gause’s data, we must express them in the form

$$N_a(t+1) = f_a(N_1(t), N_2(t)) \qquad (a = 1, 2).$$

This can be done by substituting from $x(t)$ in $x(t+1)$, and from $y(t)$ in $y(t+1)$, and using (5.2). If we
write for simplicity,

$$\lambda_1 = e^{r_1}, \quad \lambda_2 = e^{r_2}, \qquad \alpha_1 = (\lambda_1 - 1)/K_1, \quad \alpha_2 = (\lambda_2 - 1)/K_2,$$

we have finally, after a little rearrangement,

$$N_1(t+1) = \frac{\lambda_1 N_1(t)}{1 + \alpha_1 N_1(t) + \alpha_2 N_2(t)}, \qquad
N_2(t+1) = \frac{\lambda_2 N_2(t)}{1 + \alpha_1 N_1(t) + \alpha_2 N_2(t)},$$

which is the system of equations used later in § 6.
The observed ratios of the mean number of individuals counted in the samples from
the mixed cultures are given in Table 2, together with those expected from the equation

$$\frac{\bar{n}_1(t)}{\bar{n}_2(t)} = e^{0.1132t}.$$


Table 2. The observed and expected values of the ratio of the mean numbers,
Paramecium aurelia/P. caudatum, growing in mixed culture

             Ratio $\bar{n}_1(t)/\bar{n}_2(t)$

Time                              Time
(days)   Observed   Expected     (days)   Observed   Expected
   2       1.00       1.25         14       5.78       4.88
   3       1.91       1.41         15       6.44       5.46
   4       2.00       1.57         16       6.60       6.12
   5       1.84       1.76         17       8.07       6.85
   6       2.30       1.97         18       7.46       7.67
   7       1.60       2.21         19       6.55       8.59
   8       1.78       2.47         20       7.00       9.62
   9       3.15       2.77         21       8.25      10.77
  10       2.95       3.10         22      17.50      12.07
  11       4.59       3.47         23      17.50      13.51
  12       3.64       3.89         24       9.43      15.13
  13       6.18       4.36         25      17.50      16.94

Expected = $e^{0.1132t}$.


In order to see whether the departure from expectation was at all marked, the same
procedure was adopted as in the previous section, and the pooled inter-replicate variance (in
terms of $\log_{10}$) was estimated from the data for the mixed cultures. These variances were

P. aurelia 0.01319, P. caudatum 0.02423,

while the covariance was 0.0007267, each of these estimates being based on 36 degrees of
freedom. Clearly, the correlation ($r = +0.041$) between the numbers of the two species,
when the time trend is eliminated, is negligible. Thus, since each $\bar{n}_a$ is in both cases based
on three replicates, we should expect the variance of

$$y_t = \log_e \bar{n}_1(t) - \log_e \bar{n}_2(t)$$

to be given by

$$\operatorname{var}(y_t) = \tfrac{1}{3}(2.3026)^2\,(0.01319 + 0.02423) = 0.06614.$$

From the fitting of the regression line passing through the origin, the discrepancy between
observed ($y$) and expected ($Y$) was, for the 24 points,

$$s^2 = \Sigma(y - Y)^2/(n - 1) = 0.06213,$$

which is in close agreement with that expected from the variance observed between the
replicates.
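
The regression through the origin and the comparison of variances can be retraced from the observed ratios in Table 2. The sketch below is only an illustration of the arithmetic: the function and variable names are hypothetical, the ratios are the rounded values printed in Table 2, and the standard error is taken as the square root of $s^2/\Sigma t^2$, which is an assumption about how the quoted ± 0.0034 was obtained.

```python
import math

DAYS = list(range(2, 26))
# Observed ratios (P. aurelia / P. caudatum) as printed in Table 2, days 2 to 25.
OBSERVED_RATIO = [
    1.00, 1.91, 2.00, 1.84, 2.30, 1.60, 1.78, 3.15, 2.95, 4.59, 3.64, 6.18,
    5.78, 6.44, 6.60, 8.07, 7.46, 6.55, 7.00, 8.25, 17.50, 17.50, 9.43, 17.50,
]

def fit_through_origin(ts, ys):
    """Least-squares slope of y = k*t, residual mean square on n - 1
    degrees of freedom, and the standard error of the slope."""
    sum_tt = sum(t * t for t in ts)
    k = sum(t * y for t, y in zip(ts, ys)) / sum_tt
    s2 = sum((y - k * t) ** 2 for t, y in zip(ts, ys)) / (len(ts) - 1)
    return k, s2, math.sqrt(s2 / sum_tt)

y = [math.log(ratio) for ratio in OBSERVED_RATIO]   # natural logarithms
k, s2, se_k = fit_through_origin(DAYS, y)

# Variance of y expected from the inter-replicate variances (given in log10
# units, each mean being based on three replicates), converted to natural logs.
var_expected = (math.log(10) ** 2) * (0.01319 + 0.02423) / 3
```

With the rounded ratios of Table 2 this gives a slope close to the quoted 0.1132 and a residual mean square close to 0.062, to be set against `var_expected`, which is about 0.066.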
6. A FURTHER TEST OF THE HYPOTHESIS


There is another way of testing whether this simplified model of the competition equations
is a satisfactory description of these data. In § 4 we estimated the parameters $\lambda$ and $\alpha$ in
the logistic equation,

$$N_{t+1} = \frac{\lambda N_t}{1 + \alpha N_t},$$

from the data for each species when living alone. When, in the special case of (5.1), we have
$b_1 = a_2$, $b_2 = a_1$, there is an analogous set of equations when the two species $S_1$ and $S_2$ are
in a state of competition, viz.

$$N_1(t+1) = \frac{\lambda_1 N_1'(t)}{1 + \alpha_1 N_1'(t) + \alpha_2 N_2'(t)}, \qquad
N_2(t+1) = \frac{\lambda_2 N_2'(t)}{1 + \alpha_1 N_1'(t) + \alpha_2 N_2'(t)}, \tag{6.1}$$

where $\lambda_1$ and $\alpha_1$ are the logistic parameters for the species $S_1$ when living alone, and similarly
$\lambda_2$ and $\alpha_2$ those for the species $S_2$. In both equations we have

$$N_1'(t) = 0.9\,N_1(t) \quad \text{and} \quad N_2'(t) = 0.9\,N_2(t),$$

in order to allow for the removal of one-tenth of the existing population whenever a sample
census was taken. Then, given the numerical values estimated in § 4 for

P. aurelia,  $\lambda_1 = 2.4905$, $\alpha_1 = 0.00026195$,
P. caudatum, $\lambda_2 = 2.2042$, $\alpha_2 = 0.00058189$,

we can calculate, given that $N_1'(0) = N_2'(0) = 20$, the average numbers of individuals in
the mixed cultures at successive intervals of time, and therefore the expected mean number
of each species in the samples which were removed and discarded from day 2 onwards.*
These expected mean numbers can then be compared with the mean numbers actually 
observed by Gause in the mixed cultures. This is quite a stringent test of the model and of 
the simplifying assumption, since any error in the estimates of the parameters made from 
the data for each species when living alone would have a cumulative effect in any such step- 
by-step calculation, and the estimates of the expected mean numbers would tend to diverge 
more and more from those actually observed. 

The results of this calculation are given in Table 3 from day 2 until day 25 in the experi- 
ments. It will be seen from Table 3 that in the case of P. aurelia the expected numbers 
follow closely those observed until about the sixteenth day, but that, certainly after day 19, 
they then tend to diverge from the latter. On the other hand, the expected curve for 
P. caudatum appears to give a reasonably good fit to the observed data throughout the 
entire period. The discrepancy between the observed and expected numbers is of much the 
same order of magnitude as that which might be expected from the inter-replicate variances. 


* In carrying out this step-by-step calculation, it must be remembered that since no sample was taken
at $t = 1$, we have $N_1'(1) = N_1(1)$ and $N_2'(1) = N_2(1)$.
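
The step-by-step calculation described in this section can be sketched as follows. This is an illustrative reconstruction under stated assumptions rather than the original computation: the function name and its arguments are hypothetical, and the expected sample mean per 0.5 c.c. is taken to be one-tenth of the population present at the census, in line with the removal of one-tenth of the population at each sample.

```python
LAM1, ALPHA1 = 2.4905, 0.00026195   # P. aurelia, estimated when living alone
LAM2, ALPHA2 = 2.2042, 0.00058189   # P. caudatum, estimated when living alone

def expected_sample_means(days=25, start=20.0, keep=0.9, sample_fraction=0.1):
    """Iterate the competition equations day by day.

    Returns {day: (mean for P. aurelia, mean for P. caudatum)} for day >= 2.
    """
    n1_kept = n2_kept = start            # N1'(0) = N2'(0) = 20
    means = {}
    for day in range(1, days + 1):
        denom = 1.0 + ALPHA1 * n1_kept + ALPHA2 * n2_kept
        n1 = LAM1 * n1_kept / denom
        n2 = LAM2 * n2_kept / denom
        if day == 1:
            # No sample was taken on day 1, so nothing is removed.
            n1_kept, n2_kept = n1, n2
        else:
            means[day] = (sample_fraction * n1, sample_fraction * n2)
            n1_kept, n2_kept = keep * n1, keep * n2
    return means

means = expected_sample_means()
for day in (2, 3, 4):
    print(day, round(means[day][0], 1), round(means[day][1], 1))
```

For the first few days the values produced in this way agree with the expected columns of Table 3 to within rounding. Because the two species start with equal numbers and share the same denominator, the ratio of the expected means on day $t$ is exactly $(\lambda_1/\lambda_2)^t$.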
Working in logarithms to the base 10 as before, and defining $\Delta$ as the deviation between
log ‘observed’ and log ‘expected’ numbers in Table 3, we have

                                                $s^2$ expected from
                 Period                         inter-replicate
Species          (days)      $\Sigma\Delta^2/n$        variance

P. aurelia       2 to 19     0.00511            0.00440
                 2 to 25     0.00724

P. caudatum      2 to 19     0.00775            0.00808
                 2 to 25     0.00995


Table 3. The observed and expected mean number ($\bar{n}_a$) of individuals in 0.5 c.c. of culture
in the mixed populations of Paramecium aurelia and P. caudatum

              P. aurelia              P. caudatum
Time
(days)    Observed   Expected     Observed   Expected
   2         10        11.7          10         9.2
   3         21        24.5          11        17.0
   4         58        47.9          29        29.4
   5         92        84.7          50        46.0
   6        202       131.8          88        63.3
   7        163       179.9         102        76.5
   8        221       221.0         124        83.2
   9        293       253.2          93        84.4
  10        236       278.4          80        82.1
  11        303       299.1          66        78.1
  12        302       317.1          83        73.3
  13        340       333.5          55        68.2
  14        387       348.8          67        63.1
  15        335       363.2          52        58.1
  16        363       376.8          55        53.3
  17        323       389.7          40        48.8
  18        358       401.7          48        44.5
  19        308       413.0          47        40.5
  20        350       423.5          50        36.8
  21        330       433.2          40        33.3
  22        350       442.2          20        30.1
  23        350       450.5          20        27.1
  24        330       458.1          35        24.4
  25        350       465.1          20        21.9

It is quite evident that up to day 19, in the case of both species, the discrepancy between
the figures in Table 3 is no greater than that which might be expected from the degree of 
variation between the three replicates on which the means are based. Even up to day 25 
the average overall agreement is reasonable; but, as has been mentioned earlier, it is 
evident by inspection that in the case of P. aurelia, from about day 16, the expected and 
observed curves are diverging. However, considering the approximate nature of the para- 
meters estimated from the data for each species when living alone, it is remarkable how 
closely we have been able to anticipate, as it were, the results actually observed in the 
experiments when both species were living together in the same microcosm. 


7. CONCLUSION AND SUMMARY 


It seems reasonable to conclude, as a result of this analysis, that the deterministic model 
associated with the names of Lotka and Volterra is a remarkably good description of the 
changes which occurred in the mean values of these stochastic processes. The two main 
points leading to this conclusion are 


(1) The first approximation in this model for a species living alone—namely the logistic 
equation—is a good fit to the data for P. aurelia, as judged by a comparison of the mean 
square deviation between expected and observed with the variance between replicates. 
In the case of P. caudatum this equation was not a good fit; but this was due almost entirely 
to a relatively large discrepancy between expected and observed on 2 days. If these were
neglected, the fit of the logistic to the remaining observations for this species would be 
reckoned as satisfactory. 


(2) Given the logistic parameters for the two species when living alone, and assuming 
that a special case of the competition equations in this model were applicable to these 
closely related types of Protozoa, it should theoretically be possible to calculate the expected 
numbers in the sample censuses when the two species were competing together for a common 
food supply. The agreement between the expected numbers calculated in this way and 
those actually observed was very much what might be expected from the variation between 
replicates in the competition experiments. 
ON THE DISTRIBUTION OF TRIBOLIUM CONFUSUM
IN A CONTAINER* 


By D. R. COX AND W. L. SMITH


1. INTRODUCTION

Neyman, Park & Scott (1956) have recently discussed a number of statistical problems
arising from studies of the flour beetle, Tribolium. One experiment they describe was as
follows. A container was filled with fresh flour occupying a volume of a cube 10 x 10 x 10 in.
On the surface of the flour a large number of Tribolium confusum beetles was placed. The
container was kept for about four months in a dark incubator maintaining an approximately
constant temperature and humidity, and was periodically rotated. After the 4 months the
contents of the container were divided into 1000 cubes 1 x 1 x 1 in. and the contents of each
small cube were lifted separately and examined.


Table 1. Distribution of Tribolium over a plane cross-section
(averaged experimental data)

Fourth quarter of the square (centre of the square at top left, corner at bottom right)

Females     3.2     3.5     4.0     6.4    12.3
Males       2.5     1.8     2.0     3.7    14.3

Females     3.5     5.0     4.2     6.8    11.8
Males       1.8     1.3     2.2     3.8    12.3

Females     4.0     4.2     5.2     8.0    11.6
Males       2.0     2.2     2.1     4.0    12.5

Females     6.4     6.8     8.0     9.2    13.8
Males       3.7     3.3     4.0     7.2    18.6

Females    12.8    11.8    11.6    13.8    17.9
Males      14.3    12.3    12.5    18.6    30.4

Obtained from the data quoted by Neyman et al. (1956) by the averaging process described in the
text. The upper figure is the density of females, the lower figure the density of males. The average
density is symmetrical in the square and only the distribution in the fourth quarter is shown (note that
the averaged distribution is symmetrical also about the diagonal of the whole square).


The general feature of the experimental results is, on any given ‘layer’, a gradual increase 
in the density of the beetles towards the sides of the container, with the density highest in the 
corners. Table 1 is a contraction of Fig. 10 of Neyman et al. (1956) obtained by arranging 
their data on the assumption that, apart from random fluctuations, the density of beetles 
has the full symmetry properties of the square. Thus, the value given in the right-hand
corner of Table 1 is the average of the four ‘corner’ values given by Neyman et al. (1956).


* Work done at the Statistical Laboratory, University of California, Berkeley, partially supported 
by funds provided under contract with the Air Research and Development Command, USAF School of 
Aviation Medicine, Randolph Field, Texas.
The distributions for males and females are somewhat different in that the density for
females falls steadily towards the centre of the square from a relatively low maximum at the
corners, while the density for the males falls more sharply from a high maximum at the corner
and is substantially constant in the central portion. The general feature of a gradual increase
in density towards the edges and along the edges towards the corners is, however, the same
for both sexes.

After discussing some possible models to explain these phenomena, Neyman et al. (1956) 
concluded that, up to the time of their writing, no model producing the gradual increase in 
density was available. The object of the present note is to discuss a simple probability model 
that does produce a density distribution which is qualitatively similar to the one observed. 


2. PREVIOUS WORK 


Broadbent & Kendall (1953) had some success in describing the two-dimensional motion of 
Trichostrongylus retortaeformis by simple Brownian motion. Sherman investigated the con- 
sequences of such a model in the present context, and showed that a random walk over 
a square lattice with inelastic boundaries could produce a concentration of beetles at the 
boundaries and in the corners, but could not account for the gradual increase in density 
towards the edges. In a further investigation (Sherman, 1956) of the one-dimensional ran- 
dom walk, he showed that distributions more like the observed one could be obtained if 
a suitable boundary condition was inserted. This was that after striking the boundary, the 
moving point remains there a randomly distributed time, and is then placed instantaneously 
a finite distance within the region of motion. Sherman did not, however, claim this boundary
condition to be a reasonable explanation of the experimental results. 


3. POSSIBLE MODELS


One approach is to assume that the motion of the beetle is described by a more complicated 
random walk whose steps are not infinitesimal, and may indeed be correlated in direction 
and length; see, for example, the type of walk investigated by Daniels (1952). Suppose that 
this is combined with a boundary condition of the following type. On meeting the boundary 
the beetle remains there for a time-interval which has a certain frequency distribution, and 
it then pursues a path that is the reflexion in the boundary of the path that would have been 
followed in the absence of the boundary. 

It may be shown by the method of images that for any such walk, without drift, which 
leads in the case of unrestricted motion to an asymptotic normal distribution of indefinitely 
increasing dispersion, the limiting distribution within the square boundary is uniform except 
for a line concentration on the boundary. This result applies also for such a two-dimensional 
motion within a circular barrier, and is probably true for a very wide class of boundaries. 

One objection to the Brownian motion models is that the path of the beetles is highly 
irregular, as though the beetle were constantly forgetting its direction. This objection does 
not apply to the generalizations just discussed; the failure of the generalizations to produce 
the required behaviour must be connected with the use of an inappropriate boundary condi- 
tion. Indeed, the assumption of perfect reflexion seems a most unrealistic description of the 
true behaviour of Tribolium.

In the present note we shall show that by taking a more realistic boundary condition, even 
with a grossly oversimplified description of the beetles’ paths within the flour, we can obtain 
a distribution of beetles of the type observed. 
4. GENERAL DESCRIPTION OF THE MODEL


The basic assumption is that the motion is two-dimensional and that within the flour
a beetle follows a straight path. When it meets the boundary it may (i) with probability $p$,
return along its original path; or, (ii) with probability $1 - p$, move a distance $\tau$, possibly zero,
along the boundary before choosing a new direction of motion, independently of its previous
direction.

The distance $\tau$ and the angle $\theta$ that the new direction of motion makes to the normal to the
boundary at the point of departure are assumed to have distribution functions $H(\tau)$ and
$G(\theta)$. We assume for simplicity that the beetle moves with constant speed, although this
assumption may be greatly generalized without affecting the final results.

In spite of the simplicity of the model which we have just described, we have been unable 
to obtain an explicit solution to the problem of limiting beetle distribution within a square 
boundary. We shall, therefore, discuss first the solution for a circular container. 


5. SOLUTION FOR A CIRCULAR CONTAINER 


Assume that the motion takes place within a circle of unit radius. Let $\gamma_r$ denote a concentric
circle of radius $r < 1$. Let Q denote a path, defined to be the total track of the beetle between
two successive selections of a new direction of motion within the flour, i.e. a path consists of
one chord of the circle, possibly traversed several times, followed by an arc of the
circumference of the unit circle.

Consider a large number of such paths developed according to the distributions $H(\tau)$,
$G(\theta)$ defined above. The probability that a beetle is inside $\gamma_r$ is then the total length of all
parts of paths Q within $\gamma_r$, divided by the total length of all paths. Thus, it is easy to see that
the probability $\Pi(r)$, say, that a particular beetle is within $\gamma_r$ is given by

$$\Pi(r) = \frac{\text{mean length of that part of a path which is in } \gamma_r}{\text{mean length of a path}}
= \frac{\rho(r)}{\lambda}, \text{ say.} \tag{1}$$


This argument assumes that the spatial distribution of the beetles comes to statistical 
equilibrium and that probabilities and time-averages can be equated. These points can be 
justified, and (1) derived directly, by the theory of regenerative stochastic processes 
(Smith, 1955). 

The problem therefore resolves itself into the geometrical question of the value of $\rho(r)$. To
calculate $\rho(r)$ note from Fig. 1 that the chord length in $\gamma_r$ is $2\sqrt{(r^2 - \sin^2\theta)}$ and that, after
making allowance for the probability of traversing the chord several times,

$$\rho(r) = \frac{2}{1-p}\int_0^{\sin^{-1} r} (r^2 - \sin^2\theta)^{1/2}\, dG(\theta). \tag{2}$$

Similarly,

$$\lambda = \bar{\tau} + \frac{2}{1-p}\int_0^{\pi/2} (1 - \sin^2\theta)^{1/2}\, dG(\theta), \tag{3}$$

where $\bar{\tau}$ is the mean of the distribution $H(\tau)$.
To evaluate these integrals, we shall write

$$G'(\theta) = a + b\,|\sin\theta| + c\sin^2\theta; \tag{4}$$

where higher terms could be calculated, if required. Write

$$K(r) = \int_0^{\pi/2} (1 - r^2\sin^2\theta)^{-1/2}\, d\theta, \tag{5}$$

$$E(r) = \int_0^{\pi/2} (1 - r^2\sin^2\theta)^{1/2}\, d\theta, \tag{6}$$

for the complete elliptic integrals of the first and second kinds, respectively. Also put

$$\Theta_k(r) = \int_0^1 \frac{t^k\, dt}{\sqrt{\{(1 - t^2)(1 - r^2 t^2)\}}}. \tag{7}$$

It is then easily shown from (4) that

$$\rho(r) = \frac{2r^2}{1-p}\,\{a[\Theta_0(r) - \Theta_2(r)] + br[\Theta_1(r) - \Theta_3(r)] + cr^2[\Theta_2(r) - \Theta_4(r)]\}. \tag{8}$$


Fig. 1. A typical path.


The functions $\Theta_k(r)$ satisfy simple recurrence relations (Byrd & Friedman, 1954) and for
$k = 0$ through 4 they are given by

$$\Theta_0(r) = K(r), \qquad \Theta_1(r) = \frac{1}{2r}\log\frac{1+r}{1-r}, \qquad \Theta_2(r) = \frac{1}{r^2}[K(r) - E(r)],$$

$$\Theta_3(r) = \frac{1}{2r^2}\left[\frac{1+r^2}{2r}\log\frac{1+r}{1-r} - 1\right], \qquad
\Theta_4(r) = \frac{1}{3r^4}[(2 + r^2)K(r) - 2(1 + r^2)E(r)].$$

From these results we can find $\Pi(r)$ in terms of tabulated functions. A more useful
expression is for the density of beetles per unit area as a function of $r$. This is given for $r < 1$ by

$$\phi(r) = \frac{1}{2\pi r}\,\Pi'(r). \tag{9}$$

We obtain

$$\phi(r) = \frac{aK(r) + \tfrac{1}{2}b\log\dfrac{1+r}{1-r} + c[K(r) - E(r)]}{\pi[2a + b + \tfrac{2}{3}c + \bar{\tau}(1-p)]}. \tag{10}$$
The simplest, and physically most plausible distribution in the family (4) has $b = c = 0$,
$a = 2/\pi$, i.e. is the rectangular distribution for $\theta$. For this distribution

$$\phi(r) = \frac{2K(r)}{\pi[4 + \pi\bar{\tau}(1-p)]} \qquad (r < 1). \tag{11}$$

A short table of $K(r)$ is given in Table 2 and shows that the density per unit area increases
steadily from the centre. In the third and fourth columns of Table 2 are given the functions
which determine the density in the cases $a = c = 0$ and $a = b = 0$, i.e. when the density
function of $\theta$ is proportional to $|\sin\theta|$ and to $\sin^2\theta$.


Table 2. Density per unit area (unstandardized) for some special cases

   r       K(r)     log{(1+r)/(1−r)}     K(r) − E(r)
  0.0      1.57          0                  0
  0.2      1.59          0.41               0.03
  0.4      1.64          0.85               0.13
  0.6      1.75          1.39               0.33
  0.8      2.00          2.20               0.72
  0.9      2.28          2.94               1.11
  0.95     2.59          3.66               1.49
  1.00      ∞             ∞                  ∞



In addition to the continuous distribution over the interior of the unit circle, there is a line
density of probability on the circumference of amount per unit length

$$\frac{1}{2\pi}\,\frac{\bar{\tau}(1-p)}{2a + b + \tfrac{2}{3}c + \bar{\tau}(1-p)}.$$

The general conclusions to be drawn from these results are as follows. If the distribution
of $\theta$ is uniform, i.e. if the beetles select paths ‘randomly’ oriented, then the resulting density
is nearly constant for $r < \tfrac{1}{2}$ and increases steadily to infinity as $r$ tends to 1. If the probability
density of $\theta$ is proportional to $|\sin\theta|$, i.e. if there is a fairly strong tendency to select paths
nearly tangential to the boundary, the resulting distribution of beetles is naturally much
more strongly concentrated at the larger values of $r$, rising from a zero density at the centre.
Some distributions of $\theta$ showing a slight tendency to concentrate on the tangential paths can
be obtained by suitable values of $a$, $b$ and $c$. The resulting distributions of beetles are then
given by a weighted combination of the columns of Table 2 and therefore have densities
increasing with $r$. Certain distributions of $\theta$ attaching relatively greater probability to the
paths normal to the boundary can be represented by negative values of $b$ and $c$, provided only
that (4) is always positive and integrates to unity. In these cases the form of $\phi(r)$ depends, of
course, on the relative magnitudes of the coefficients.
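
The columns of Table 2 and the density (10) can be evaluated numerically. The sketch below is an illustration only: it computes the complete elliptic integrals by Simpson's rule applied to their defining integrals (5) and (6), rather than from published tables, and the function names and the number of subintervals are assumptions made for the example.

```python
import math

def simpson(f, lower, upper, n=2000):
    """Composite Simpson's rule with an even number n of subintervals."""
    h = (upper - lower) / n
    total = f(lower) + f(upper)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lower + i * h)
    return total * h / 3

def K(r):
    """Complete elliptic integral of the first kind, modulus r < 1."""
    return simpson(lambda th: 1.0 / math.sqrt(1.0 - (r * math.sin(th)) ** 2),
                   0.0, math.pi / 2)

def E(r):
    """Complete elliptic integral of the second kind, modulus r < 1."""
    return simpson(lambda th: math.sqrt(1.0 - (r * math.sin(th)) ** 2),
                   0.0, math.pi / 2)

def table2_row(r):
    """The three unstandardized density functions tabulated in Table 2."""
    return K(r), math.log((1 + r) / (1 - r)), K(r) - E(r)

def density(r, a, b, c, tau_bar=0.0, p=0.0):
    """Density per unit area for 0 <= r < 1, following (10)."""
    k_r, log_term, k_minus_e = table2_row(r)
    numerator = a * k_r + 0.5 * b * log_term + c * k_minus_e
    denominator = math.pi * (2 * a + b + 2 * c / 3 + tau_bar * (1 - p))
    return numerator / denominator

for r in (0.0, 0.2, 0.4, 0.6, 0.8, 0.9, 0.95):
    print(r, [round(value, 2) for value in table2_row(r)])
```

For the rectangular distribution of the angle, `density(r, 2 / math.pi, 0.0, 0.0)` gives the special case (11) with no motion along the boundary; at the centre it equals 1/4, below the value $1/\pi$ of a uniform distribution over the unit circle, and it rises as $r$ approaches 1.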


6. SOLUTION FOR THE SQUARE 
As we have remarked above, the mathematical treatment for the square container is much 
more difficult than for the circular container. This is primarily because it is first of all 
necessary to find the distribution of starting points of paths along the edges of the square. 
This distribution is determined by an integral equation which we have been unable to solve. 
Instead of an exact mathematical treatment of the case of a square container we have been
obliged to introduce approximate methods. We assumed that paths may start and end only
at a discrete set of points along the sides of the square. We have made the further
assumptions that $\theta$ is uniformly distributed and that there is no motion along the boundary,
$\tau = 0$. By a tedious procedure it is possible to work out the equilibrium distribution of
beetles over the square. A brief outline of the steps in these calculations follows.

(i) Eight boundary points were symmetrically placed at distances ½, 1½, ..., 7½ from the
ends of the sides of a square of side 8. These points were then numbered 1, 2, ..., 8 cyclically.

(ii) Transition probabilities giving the probability that a path starting from a point $j$ on
one side would strike one of the other sides within a distance ½ of a point $k$ were calculated on
the basis of a uniform distribution for $\theta$. It was then assumed that any path, which ended
within a distance ½ of a point $k$, actually ended at $k$.

(iii) Symmetry considerations show that the final distributions can be obtained by 
working with paths starting from points 1, 2, 3, 4 only, on one side. Therefore, a Markov 
chain with four states was involved and a system of linear equations determined the 
equilibrium probability that a path starts from a position with a specified number. 

(iv) The probability attached to each possible path was then calculated using (iii) and (ii). 

(v) The interior of the square was divided into suitable areas and the length of each path 
within each area tabulated. When multiplied by the correct probability from (iv) these 
lengths gave the numerator in the expression analogous to (1). The mean length of all paths 
was determined in a similar way, and hence the equilibrium probabilities of finding a beetle 
in the various areas of the square were determined. 
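
As a rough independent check on steps (i) to (v), the same model (straight paths, uniform starting angle, no motion along the boundary and no return along the original path) can be simulated directly. The sketch below uses a Monte Carlo simulation in place of the discrete Markov chain described above, so it is a different technique from the one used for Table 3; the function name, the number of paths, the 8 x 8 grid and the seed are assumptions made for the illustration. Each chord contributes one uniformly chosen point, weighted by the chord length, so that the accumulated weights estimate the path length falling in each unit cell.

```python
import math
import random

def simulate_square(n_paths=100_000, side=8.0, cells=8, seed=12345):
    """Estimate the equilibrium probability of each cell of a square grid."""
    rng = random.Random(seed)
    # Inward unit normals of the edges: 0 bottom, 1 right, 2 top, 3 left.
    normals = [(0.0, 1.0), (-1.0, 0.0), (0.0, -1.0), (1.0, 0.0)]
    weight = [[0.0] * cells for _ in range(cells)]
    x, y, edge = rng.uniform(0.0, side), 0.0, 0
    for _ in range(n_paths):
        # New direction: the inward normal rotated by a uniform angle.
        theta = rng.uniform(-math.pi / 2, math.pi / 2)
        nx, ny = normals[edge]
        dx = nx * math.cos(theta) - ny * math.sin(theta)
        dy = nx * math.sin(theta) + ny * math.cos(theta)
        # Distances along the path to the next vertical and horizontal wall.
        tx = math.inf if dx == 0 else ((side - x) if dx > 0 else x) / abs(dx)
        ty = math.inf if dy == 0 else ((side - y) if dy > 0 else y) / abs(dy)
        t = min(tx, ty)                      # chord length (unit direction)
        u = rng.random()                     # uniform point along the chord
        col = min(cells - 1, max(0, int((x + u * t * dx) / side * cells)))
        row = min(cells - 1, max(0, int((y + u * t * dy) / side * cells)))
        weight[row][col] += t
        if tx < ty:                          # the path ends on a vertical wall
            x, y = (side if dx > 0 else 0.0), min(side, max(0.0, y + t * dy))
            edge = 1 if dx > 0 else 3
        else:                                # the path ends on a horizontal wall
            x, y = min(side, max(0.0, x + t * dx)), (side if dy > 0 else 0.0)
            edge = 2 if dy > 0 else 0
    total = sum(map(sum, weight))
    return [[w / total for w in row] for row in weight]

dens = simulate_square()
corner = (dens[0][0] + dens[0][7] + dens[7][0] + dens[7][7]) / 4
centre = (dens[3][3] + dens[3][4] + dens[4][3] + dens[4][4]) / 4
```

The estimates carry sampling error and need not reproduce Table 3 figure for figure, but the corner cells should again show a higher probability than the central cells.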


Table 3. Equilibrium density of beetles over a square, for a simple model

Fourth quarter of the square (centre of the square at top left, corner at bottom right)

   0.0131    0.0125    0.0135    0.0176
   0.0125    0.0139    0.0144    0.0167
   0.0135    0.0144    0.0144    0.0183
   0.0176    0.0167    0.0183    0.0227


The probabilities found in this way are shown in Table 3. The distribution is only given for 
the fourth quarter of the square because of the symmetry involved. 

Notice that the general features of Table 3 are the same as those of the experimental 
results in Table 1. That is, the density is fairly constant in the centre of the square and rises 
steadily towards the sides. Along each edge the density rises to maxima in the corners. 

However, there are two discrepant features. First, there are puzzling minor irregularities 
in the computed values in Table 3. For example, the density in the centre of the square is 
slightly higher than that below or to the side of the centre. There are three possible explana- 
tions of these irregularities. 

(i) They may represent real features of the random motion, in the sense that they would 
be present in the solution of the correct ‘continuous’ model where the paths are not restricted 
to a discrete set. 

(ii) They may be the result of numerical errors that have escaped the various checking 
schemes used. 
(iii) They may be a consequence of the ‘discrete’ approximation adopted. Thus, we have
replaced the whole set of paths starting from points in an interval on one side of the square, 
and going to points in an interval on another side of the square, by one path. This single path 
will, in general, contribute zero probability to regions that, in the continuous model, would 
receive appreciable probability. While this unbalance should tend to even itself out where 
there are many paths, it is reasonable to expect some irregularities to remain. 

We are of the opinion that (i) and (ii) are unlikely, and that (iii) is the correct explanation. 

The second discrepancy is that the magnitudes of the trends in Table 3 do not agree too 
well with the experimental values. The corner density in Table 3 is less than twice the density 
in the centre, whereas the corresponding ratios for the experimental densities are about 
4 for females and 15 for males. The theoretical trend along the edge from the centre is about 
correct for females and somewhat too small for males. Some of this lack of agreement could 
be accounted for by the kind of assumption, made in §5, that a beetle which has reached the 
boundary tends to remain there for some time before choosing a new path within the flour. 


7. A MORE REALISTIC MODEL 


The major artificiality in the model just analysed is the assumption that the paths are 
straight in the interior of the container. A more realistic model would be the following. At 
the beginning of a new path from the boundary the beetle chooses a starting angle 0 as in the 
earlier model. However, let it now be assumed that in an element dx of the path there is 
a constant chance ~éx + 0(éx) that a change of direction will occur. Also, suppose that if 
a change does occur then it is uniformly distributed over (0, 277). The beetle then moves 
along in the new direction until either it meets the boundary or a further change in direction 
occurs. Such a model would have three adjustable features. 

(i) The parameter ~ which determines the average length of straight sections of path. 

(ii) The distribution of boundary starting angles @. 

(iii) The magnitude of any tendency to stay near the boundary for some time before 
choosing a new path. 

It should be possible to obtain a mathematical expression for the way the equilibrium 
density of beetles depends on these features. However, we have been unable to obtain such 
an expression. Certain qualitative properties of the model are more or less obvious. If jv is 
small we have nearly the case which we have discusse in the main part of the present paper. 
If wis large we have very nearly Brownian motion, leading to a uniform distribution, except 
for a line concentration in the boundary. Moreover, it is fairly clear what the general effect 
of changes from a uniform distribution of 9 would be. If values of 6 near 47 were relatively 
frequent the density near the edges of the square is increased relative to that at the centre. 
A tendency to remain stationary near the boundary would also, obviously, increase the 
density in the regions adjacent to the boundary. 

It seems plausible that a good fit to both sets of data in Table 1 could be obtained by 
suitably adjusting the three features listed above, and moreover that a reasonable fit is 
possible even with the simple uniform distribution of $\theta$. This would explain the experimental
results without appeal to a special hypothesis that the beetles’ motions are different near to 
and distant from the surface. 
THE $\chi^2$ GOODNESS-OF-FIT TEST FOR NORMAL DISTRIBUTIONS


By G. S. WATSON


Australian National University 


SUMMARY. Chernoff & Lehmann (1954) have shown that if, in making a $\chi^2$ goodness-of-fit test of a
continuous distribution, the parameters are estimated efficiently from the sample and not from the
cell frequencies, the statistic $X^2$* is asymptotically distributed as

$$\chi^2_{k-s-1} + \lambda_1 y_1^2 + \ldots + \lambda_s y_s^2, \quad \text{where } 0 < \lambda_1, \ldots, \lambda_s < 1,$$

$y_1, \ldots, y_s$ being independent standard normal variables. They used $k$ fixed class intervals, defined without
reference to the sample. In the present paper, the particular case of the normal distribution is
examined when the class intervals are chosen to contain, with reference to the sample mean and variance,
constant probabilities and so vary with the sampling. The $X^2$ statistic is again found to be distributed
in the Chernoff & Lehmann form ($s = 2$). Explicit formulae are given for $\lambda_1$ and $\lambda_2$, which tend to zero
rapidly as $k$ increases. It is suggested that at least ten class intervals be used in practice so that the
tabular points of $\chi^2$ may be used with an error of less than 1 %.


1. INTRODUCTION 


Fisher (1924) first established the fundamental theorem of $\chi^2$ goodness-of-fit tests: that
the $X^2$ statistic is asymptotically distributed as $\chi^2$, the degrees of freedom being the number
($k$) of classes less one less the number ($s$) of parameters estimated. Cramér's well-known
text (1946) gives the most rigorous proof available but, like Fisher's argument, it is
formulated for a multinomial situation, i.e. there are $k$ classes with probabilities $p_l(\theta_1, \theta_2, \ldots)$,
$l = 1, \ldots, k$, and the observations consist of class frequencies $n_1, n_2, \ldots, n_k$, $\sum n_l = N$. Provided
the parameters admit regular efficient estimators (which are functions of $n_1, n_2, \ldots, n_k$)
the result follows.

In fitting a continuous distribution with density $f(x; \theta_1, \theta_2, \ldots)$, $k$ class intervals $(-\infty, Z_1)$,
$(Z_1, Z_2)$, ..., $(Z_{k-1}, \infty)$ could be used. Then

$$p_l(\theta_1, \theta_2, \ldots) = \int_{Z_{l-1}}^{Z_l} f(x; \theta_1, \theta_2, \ldots)\,dx. \tag{1}$$

Suppose $n_1, n_2, \ldots, n_k$ are the observed frequencies in these intervals. Then, if $Z_1, Z_2, \ldots, Z_{k-1}$
are fixed, and if the class frequencies rather than the sample values $x_1, x_2, \ldots, x_N$ are used
efficiently to estimate $\theta_1, \theta_2, \ldots$, Fisher's result may be used, provided Cramér's conditions
are satisfied.

It is, however, probably universal practice in this situation to use the sample values
$x_1, x_2, \ldots, x_N$ to provide efficient estimates $\hat\theta_1, \hat\theta_2, \ldots$ of $\theta_1, \theta_2, \ldots$.

These are not the estimators envisaged above which are functions of the cell frequencies.
They contain more information and thus are more efficient. Fisher, in his 1924 paper,
showed that if, in the multinomial problem, estimators were used with efficiencies differing
from the efficiencies of the maximum likelihood (or minimum $\chi^2$) estimators based on
frequencies, the statistic would not have the $\chi^2$ distribution. Neyman & Pearson (1928,
1931), and Sheppard (1929) examined the situation again and showed that $X^2$ is larger than
$\chi^2_{k-s-1}$. The detailed solution to the problem was first given by Chernoff & Lehmann (1954).
For the problem of testing the goodness-of-fit of a continuous distribution, they assumed 


* Following Cochran (1952), we use $X^2$ to denote the statistic $\sum (f_o - f_e)^2/f_e$ and chi-square (or $\chi^2$) for
the well-known distribution (or a variable with this distribution).
that the cell boundaries $Z_1, Z_2, \ldots, Z_{k-1}$ are fixed and $\theta_1, \theta_2, \ldots, \theta_s$ were estimated efficiently
from the sample $x_1, x_2, \ldots, x_N$ and their result showed that $X^2$ was distributed as

$$\chi^2_{k-s-1} + \lambda_1 y_1^2 + \ldots + \lambda_s y_s^2,$$

where $\lambda_1, \ldots, \lambda_s$ are certain latent roots, all in the interval (0, 1) and $y_1, \ldots, y_s$ are standard
normal variables, independent of each other and of $\chi^2_{k-s-1}$. This shows explicitly why $X^2$
is greater than $\chi^2_{k-s-1}$, the fact observed by the earlier writers. They applied their
general theorem to the normal distribution and showed that the numerical effects may be
important, e.g. that the significance level may well be twice the tabular value.

Since, with Chernoff & Lehmann's formulation, the distribution of $X^2$ depends on the
unknown parameters, it is hard to see the practical significance of their result. If, however,
the cell probabilities $p_1, p_2, \ldots, p_k$ are prescribed and the class intervals chosen so that,
always,

$$p_l = \int_{Z_{l-1}}^{Z_l} f(x; \hat\theta_1, \hat\theta_2, \ldots)\,dx, \tag{2}$$

then it will be shown below for the case of the normal, that the distribution of $X^2$ does not
depend on the values of $\theta_1$, $\theta_2$, but it is still not that of $\chi^2$; in fact, it is a special case of that
given by Chernoff & Lehmann. The relevance of this formulation (fixed cell probabilities
rather than fixed cell boundaries) to practice is not immediately obvious. Before defending
it, a survey of current practical methods is necessary.

Even after restricting discussion to the normal distribution, it is not easy to define the 
current practices for the choice of class intervals. The most usual method may be seen in 
the examples of Cramér (1946, §§ 30-34). The intervals, except for the two extreme ones, 
are of equal length. This has the advantage that the determination of the class frequencies 
is easy if the end-points are simply related to the scale of measurement, and also makes the 
histogram simple to assess visually. The number of class intervals is restricted above by the 
requirement that the expected class frequencies are not to be ‘too small’ (i.e. to avoid 
deficiencies in the asymptotic theory due to small sample effects) and below by considera- 
tion of loss of sensitivity (i.e. power). While there is no rule for positioning the class boun- 
daries, it is clear that their position is related to the mean of the sample, and that the length 
of the class intervals is conditioned by the variance. Often this means only that the class 
intervals are determined by inspection of the data. Since the data varies from sample to 
sample, the class intervals may well do so. The restrictions on the expected class fre- 
quencies are restrictions on the probability content of the cells, although they do not 
require the probability contents of the cells to be fixed. With a large mass of data, possibly 
the simplest method of obtaining class intervals with ‘reasonable’ probability contents 
(i.e. ‘reasonable’ expected frequencies), is to compute the sample mean and variance, 
$\bar{x}$ and $s^2$, and construct intervals centred on $\bar{x}$ with lengths of some chosen multiple of $s$.
This requires no judgement and the boundaries can be altered slightly, if necessary, to 
coincide nearly with the scale of measurement. If this is done, we have exactly, or approxi- 
mately, the formulation (2). Another case where our formulation is exactly correct will be 
given below when the work of Mann & Wald (1942) is discussed. Thus our formulation is 
very closely related to practical methods. 

To arrive at something which can be analysed mathematically and which is similar to 
this procedure, the following rule is proposed. For k even, let the sample mean be a class 


boundary, and let an equal number of intervals be marked out on either side of it of a length
equal to some multiple of the standard deviation of the sample. For k odd, let the sample 
mean be at the centre of a class interval and let an equal number of intervals be marked 
out on either side of it of length equal to the central interval and to some multiple of the 
sample standard deviation. A strict symmetry of the class intervals about the sample mean 
is used in these rules as it leads to mathematical simplicity. In practice, of course, this 
symmetry is usually only approximate as it conflicts with the desire to have the class
boundaries simply related to the units of measurement. 
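
The rule for $k$ even and $k$ odd can be written as a single formula for the boundaries. The sketch below is illustrative only: the function name `equal_length_boundaries` and its arguments are not from the paper, and `h` denotes the chosen multiple of the sample standard deviation.

```python
def equal_length_boundaries(xbar, s, k, h):
    """Interior class boundaries Z_1..Z_{k-1}, symmetric about xbar.

    Each interior interval has length h*s. The standardised boundaries are
    z_l = (l - k/2) * h for l = 1..k-1: for k even the middle boundary is the
    sample mean itself; for k odd the mean sits at the centre of the middle
    interval (-h/2, +h/2).
    """
    if k < 2:
        raise ValueError("need at least two classes")
    return [xbar + (l - k / 2) * h * s for l in range(1, k)]
```

For example, with `xbar = 0`, `s = 1` and `h = 1`, four classes give boundaries at -1, 0, 1 and three classes give boundaries at -0.5 and 0.5.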

Mann & Wald (1942) have argued that for reasons of power, the class intervals should 
have equal probability content. They have given a formula for the number of intervals k, 
in terms of N and the significance level to be used, again based on power. These rules are 
proved for an arbitrary distribution, without parameters requiring estimation. This restric- 
tion does not affect the argument for the equal probability intervals and this part of the 
rule has certainly (see §5) been used for the present problem. In fact, it also has the merit 
of numerical simplicity since all the denominators in $X^2$ are equal. Clearly, the formulation
is that of (2), with, in addition, $p_l = 1/k$ ($l = 1, 2, \ldots, k$). It is easily analysed by the present
approach since the class intervals are symmetrical about the sample mean. The formula 
for k is certainly no longer valid and, without making an investigation of this point, 
the only assertion that can be confidently made is that many more class intervals should 
be used than is now customary. The present paper provides a quite different reason for this 
recommendation, as will be seen later when the equal probability and equal length interval 
rules are examined. 
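
As an illustration of the equal-probability formulation, the following sketch forms the boundaries $\bar{x} + z_l s$ with $p_l = 1/k$ and computes $X^2$. It rests on assumptions not fixed by the text at this point: the variance estimate uses divisor $N$, and the function name `equal_probability_x2` is hypothetical.

```python
from bisect import bisect_right
from statistics import NormalDist, fmean, pstdev


def equal_probability_x2(sample, k):
    """Return (boundaries, counts, X^2) for k equally probable classes."""
    xbar = fmean(sample)
    s = pstdev(sample, mu=xbar)          # assumption: divisor N
    std_normal = NormalDist()
    # Boundaries Z_l = xbar + z_l * s, where z_l is the l/k normal quantile.
    boundaries = [xbar + s * std_normal.inv_cdf(l / k) for l in range(1, k)]
    counts = [0] * k
    for x in sample:
        counts[bisect_right(boundaries, x)] += 1
    expected = len(sample) / k           # N * p_l, the same for every class
    x2 = sum((n - expected) ** 2 for n in counts) / expected
    return boundaries, counts, x2
```

Because every class has the same expected frequency, the sum of squared deviations is divided by a single denominator, which is the numerical simplicity mentioned above.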

As has been remarked by Chernoff & Lehmann and the earlier writers, the effect on the 
distribution of $X^2$ of different estimators is not entirely a matter of their efficiency. The
replacement of a parameter by an estimator usually increases the variability of the results. 
If, in the present problem, an estimator found from the sample under test is used, there is 
a contrary tendency, since the hypothesis is forced to conform to the sample. To differentiate 
between these opposing effects, several cases (which do not seem to arise in practice) have 
been considered, where independent estimators of the mean and variance, also from 
samples of N, are available. It is found, for k large, that the mean and the variance behave 
in the same way—an independent estimator increases, roughly, the degrees of freedom by 
unity, an estimator from the sample under test decreases the degrees of freedom by unity,
from the value when the parameter is known. 

The discussion in this paper is limited to the asymptotic distribution obtained when a 
normal distribution is fitted. In further papers a general asymptotic theorem will be proved 
and applied to other distributions, some of which show serious, and others trivial, under- 
estimation of the Type I Error in $X^2$ goodness-of-fit tests. Sampling investigations are
planned to investigate the small sample phenomena. 


2. DISTRIBUTION OF $X^2$: GENERAL CASE


In the notation of §1, let $p_1, p_2, \ldots, p_k$ be the areas desired above the class intervals, as
defined in (2) and let $n_1, n_2, \ldots, n_k$ ($\sum n_l = N$) be the class frequencies; then the statistic
whose distribution is required is

$$X^2 = \sum_{l=1}^{k} \frac{(n_l - Np_l)^2}{Np_l}. \tag{3}$$
For this purpose, let indicator variables be defined as follows:

$$Y_i(l) = \begin{cases} 1 & \text{if } x_i \text{ falls in } (Z_{l-1}, Z_l), \\ 0 & \text{if } x_i \text{ does not fall in } (Z_{l-1}, Z_l), \end{cases} \tag{4}$$

where $Z_0 = -\infty$, $Z_k = +\infty$. Then

$$n_l = \sum_{i=1}^{N} Y_i(l), \tag{5}$$

where the joint distribution of $Y_1(l), Y_2(l), \ldots, Y_N(l)$ is unchanged by permutation of the
subscripts. Thus

$$E(n_l) = N\,P(Y_1(l) = 1), \tag{6}$$
$$E(n_l^2) = N\,P(Y_1(l) = 1) + N(N-1)\,P(Y_1(l)\,Y_2(l) = 1), \tag{7}$$
$$E(n_l n_m) = N(N-1)\,P(Y_1(l)\,Y_2(m) = 1). \tag{8}$$


Since, by (5), the $n_l$ ($l = 1, \ldots, k$) are expressed as sums of $N$ variables, it will in general
be true when $N$ tends to infinity that the joint distribution of $n_1, \ldots, n_k$ is multivariate
normal with a mean vector and covariance matrix obtainable from (6), (7), (8). $X^2$ may be
written as

$$X^2 = \sum_{l} \frac{(n_l - E(n_l))^2}{Np_l} + 2\sum_{l} \frac{(n_l - E(n_l))(E(n_l) - Np_l)}{Np_l} + \sum_{l} \frac{(E(n_l) - Np_l)^2}{Np_l}. \tag{9}$$

The method will be to show from the joint distribution of the $n_l$'s that the last two terms
on the right-hand side of (9) contribute nothing to the limiting ($N = \infty$) distribution of $X^2$,
while the first term is distributed as a quadratic form in normal variables. For this it is
necessary to determine $P(Y_1(l) = 1)$ and $P(Y_1(l)\,Y_2(m) = 1)$ to terms of order $O(N^{-1})$.

This programme will now be carried out for the normal distribution. It is clearer to 
divide this into two sections—first when only the mean requires estimation, and second 
when both the mean and the variance require estimation. 





3. NORMAL DISTRIBUTION, MEAN UNKNOWN, VARIANCE KNOWN 
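Let $z_1, z_2, \ldots, z_{k-1}$ be such that

$$p_l = \int_{z_{l-1}}^{z_l} \phi(t)\,dt, \tag{10}$$

where $z_0 = -\infty$, $z_k = +\infty$, $\phi(t) = (2\pi)^{-1/2}\exp(-\tfrac{1}{2}t^2)$. If the unknown mean is estimated by
the mean $\bar{x}$ of the sample under test, it is clear that the $Z_l$ of §2 are simply $z_l + \bar{x}$. Hence,
$n_l$ is the number of $x_1, x_2, \ldots, x_N$ in $(z_{l-1} + \bar{x}, z_l + \bar{x})$, which is the same as the number of
$x_1 - \bar{x}, \ldots, x_N - \bar{x}$ in $(z_{l-1}, z_l)$. The indicators will be defined by

$$Y_i(l) = \begin{cases} 1 & \text{if } x_i - \bar{x} \text{ falls in } (z_{l-1}, z_l), \\ 0 & \text{if } x_i - \bar{x} \text{ does not fall in } (z_{l-1}, z_l). \end{cases} \tag{11}$$

It is convenient to use $t_i = (x_i - \bar{x})/\sqrt{(1 - 1/N)}$ since then $t_1, t_2, \ldots, t_N$ are jointly normal,
with zero means, unit variances and correlations $\rho = -1/(N-1)$. Thus the joint density of
$t_1$ and $t_2$, to the required order, is given by

$$f(t_1, t_2) = \frac{\exp\{-\tfrac{1}{2}(t_1^2 + t_2^2)\}}{2\pi}\,\{1 + \rho\,t_1 t_2\}. \tag{12}$$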
To make the subsequent formulae simpler, it is convenient to introduce the notation

$$\Phi_i(l, N) = \int_{z_{l-1}(1-1/N)^{-1/2}}^{z_l(1-1/N)^{-1/2}} t^i \phi(t)\,dt. \tag{13}$$

For brevity the arguments $l$ and $N$ are only shown when there is any doubt. From (11),
$P(Y_1(l) = 1)$ and $P(Y_1(l)\,Y_2(m) = 1)$ are obtained by integrating (12). Thus

$$P(Y_1(l) = 1) = \Phi_0(l), \qquad P(Y_1(l)\,Y_2(m) = 1) = \Phi_0(l)\Phi_0(m) + \rho\,\Phi_1(l)\Phi_1(m). \tag{14}$$

Writing

$$\psi_0(l) = \phi(z_l) - \phi(z_{l-1}), \qquad \psi_1(l) = z_l\phi(z_l) - z_{l-1}\phi(z_{l-1}), \tag{15}$$

it is easily seen that, to the orders in which they will be required,

$$\Phi_0 = p_l + \frac{\psi_1}{2N}, \qquad \Phi_1 = -\psi_0, \qquad \Phi_2 = p_l - \psi_1. \tag{16}$$

Hence

$$E(n_l) = Np_l + \psi_1(l)/2,$$
$$\mathrm{var}(n_l) = N(p_l - p_l^2 - \psi_0^2(l)), \tag{17}$$
$$\mathrm{covar}(n_l, n_m) = N(-p_l p_m - \psi_0(l)\psi_0(m)).$$

Putting $\psi_0' = [\psi_0(1), \ldots, \psi_0(k)]$, $1' = [1, \ldots, 1]$ ($1 \times k$) and $D(p_l) = D$ for the $k \times k$ matrix
with the $p_l$ down the main diagonal in order, the covariance matrix of the $n_l$'s may be written

$$\Lambda_1 = N\{D - D11'D - \psi_0\psi_0'\}. \tag{18}$$


Thus, asymptotically, the second last term of (9) is a normal variable with zero mean and
variance

$$\frac{1}{N}\,\psi_1' D^{-1}\{D - D11'D - \psi_0\psi_0'\}D^{-1}\psi_1,$$

and so converges stochastically to zero as $N \to \infty$, while the last term is equal to $\psi_1'D^{-1}\psi_1/4N$
and also tends to zero. The first term is distributed as

$$\lambda_1 y_1^2 + \lambda_2 y_2^2 + \ldots + \lambda_k y_k^2, \tag{19}$$

where $y_1, y_2, \ldots, y_k$ are independent standard normal deviates and the $\lambda_i$ are the latent
roots of

$$(ND)^{-1}\Lambda_1 = I - 11'D - D^{-1}\psi_0\psi_0'. \tag{20}$$


Since (20) can be written as the sum of the matrices

$$I - 11'D - \frac{D^{-1}\psi_0\psi_0'}{\psi_0'D^{-1}\psi_0} \qquad \text{and} \qquad (1 - \psi_0'D^{-1}\psi_0)\,\frac{D^{-1}\psi_0\psi_0'}{\psi_0'D^{-1}\psi_0},$$

the first of which is idempotent of rank $k - 2$, the second of which is a multiple, $1 - \psi_0'D^{-1}\psi_0$,
of an idempotent with a zero product with the first matrix, the roots $\lambda_i$ of (20) are

$$1 - \psi_0'D^{-1}\psi_0, \qquad 1 \ (k - 2 \text{ times}). \tag{21}$$
Thus the distribution of $X^2$ is here

$$\chi^2_{k-2} + (1 - \psi_0'D^{-1}\psi_0)\,y_1^2. \tag{22}$$

The root $1 - \psi_0'D^{-1}\psi_0$ is shown in Tables 1 and 2 for various values of $k$ and types of class
intervals. It lies between 0 and 1 and tends to zero as $k \to \infty$ provided the probability
content of all the classes tends to zero. A heuristic proof of this limit may easily be given
since

$$\psi_0'D^{-1}\psi_0 = \sum_{l=1}^{k} \frac{\psi_0^2(l)}{p_l} \to \int_{-\infty}^{\infty} \frac{(d\phi/dt)^2}{\phi}\,dt = 1. \tag{23}$$
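
The limit (23) can be checked numerically. The sketch below evaluates the quadratic form for $k$ equally probable classes, assuming the reading $\psi_0(l) = \phi(z_l) - \phi(z_{l-1})$; the function name `psi0_quadratic_form` is illustrative.

```python
from statistics import NormalDist


def psi0_quadratic_form(k):
    """Sum over classes of psi_0(l)^2 / p_l for k equal-probability classes."""
    nd = NormalDist()
    # phi at the boundaries z_0 = -inf, z_1, ..., z_{k-1}, z_k = +inf
    phi = [0.0] + [nd.pdf(nd.inv_cdf(l / k)) for l in range(1, k)] + [0.0]
    # Every p_l equals 1/k, so dividing by p_l is multiplying by k.
    return k * sum((phi[l] - phi[l - 1]) ** 2 for l in range(1, k + 1))


for k in (2, 10, 100):
    print(k, psi0_quadratic_form(k))
```

The printed values should increase towards 1 as `k` grows, so that the root $1 - \psi_0'D^{-1}\psi_0$ decreases towards zero.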


To study the effect of using the mean $m$ of an independent sample of $N$, instead of $\bar{x}$, to
estimate $\mu$, consider new variables $t_i = x_i - m$. They are jointly normal with zero means,
variances $1 + 1/N$ and correlations $1/(N+1)$. Here $n_l$ will be the number of $t_i$ which fall in
$(z_{l-1}, z_l)$. Introducing suitable indicators to achieve this, the previous argument may be
repeated and it leads to

$$E(n_l) = Np_l - \psi_1(l)/2,$$
$$\mathrm{var}(n_l) = N(p_l - p_l^2 + \psi_0^2(l)), \tag{24}$$
$$\mathrm{covar}(n_l, n_m) = N(-p_l p_m + \psi_0(l)\psi_0(m)).$$

Thus, in this case, $X^2$ is distributed as

$$\chi^2_{k-2} + (1 + \psi_0'D^{-1}\psi_0)\,y_1^2, \tag{25}$$

which, for large $k$, is equal to $\chi^2_{k-2} + 2y_1^2$.


The results (22), (25) and the standard result when $\mu$ is known may be written together,
when $k$ is large, as

$$X^2 = \begin{cases} \chi^2_{k-2} + 2y^2 & \text{(mean estimated from an independent sample of } N\text{)}, \\ \chi^2_{k-1} & \text{(mean known)}, \\ \chi^2_{k-2} & \text{(mean estimated from sample)}. \end{cases} \tag{26}$$

The results (26) could be interpreted roughly to mean that the correlation (or the conformity
of the sample to the hypothesis) from using the sample mean is 'worth' two degrees of
freedom in overcoming the effect of variability which is 'worth' one degree of freedom.


4. NORMAL DISTRIBUTION, MEAN AND VARIANCE UNKNOWN


First it will be supposed that the mean and variance of the normal distribution are estimated
by $\bar{x}$ and $s^2$, found from the sample under test. In this case $Z_1, Z_2, \ldots, Z_{k-1}$ will be found from

$$p_l = \int_{Z_{l-1}}^{Z_l} \frac{1}{s\sqrt{2\pi}} \exp\left\{-\frac{(x - \bar{x})^2}{2s^2}\right\} dx,$$

so that $Z_l = \bar{x} + z_l s$. Thus $Z_{l-1} < x_i < Z_l$ is the same as

$$z_{l-1} < (x_i - \bar{x})/s < z_l, \qquad \text{i.e.} \qquad z_{l-1} < t_i < z_l.$$

The variates $t_i = (x_i - \bar{x})/s$ have an awkward distribution with a finite range
$(-\sqrt{N}, \sqrt{N})\,(1 + (N-1)^{-1})^{-1/2}$.
By an argument, often used for serial correlations (see, for example, Watson (1955)),

$$E(t_1^a t_2^b) = \frac{E\{(x_1 - \bar{x})^a (x_2 - \bar{x})^b\}}{E(s^{a+b})},$$

so that all the joint moments of $t_1$ and $t_2$ are available. An asymptotic expansion for $f(t_1, t_2)$
can be found by writing (with $H_i(x)$, the $i$th Hermite polynomial as defined by Cramér
(1946))

$$f(t_1, t_2) = \phi(t_1)\phi(t_2) \sum_{i=0}^{\infty}\sum_{j=0}^{\infty} a_{ij} H_i(t_1) H_j(t_2)$$

and evaluating the coefficients $a_{ij}$ up to and including those of order $N^{-1}$. Since

$$f(t_1, t_2) = f(t_2, t_1),$$

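we have $a_{ij} = a_{ji}$. The first four moments of $t_1$ (and $t_2$) are

$$\mu_1 = \mu_3 = 0, \qquad \mu_2 = 1, \qquad \mu_4 = 3(N-1)/(N+1).$$

Hence (to order required)

$$a_{00} = 1, \qquad a_{40} = a_{04} = -1/4N, \qquad a_{11} = -1/N, \qquad a_{22} = -1/2N; \qquad a_{ij} = 0 \text{ otherwise},$$

as may be shown by further calculations, so that

$$f(t_1, t_2) = \phi(t_1)\phi(t_2)\left[1 - \frac{H_1(t_1)H_1(t_2)}{N} - \frac{H_2(t_1)H_2(t_2)}{2N} - \frac{H_4(t_1) + H_4(t_2)}{4N}\right]. \tag{27}$$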
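To apply the method of §2, indicators $Y_i(l)$ are defined as in (11) but using $(x_i - \bar{x})/s$
instead of $x_i - \bar{x}$. From (27), it follows that

$$P(Y_1(l) = 1) = \Phi_0 - \frac{1}{4N}(\Phi_4 - 6\Phi_2 + 3\Phi_0),$$

$$P(Y_1(l)\,Y_2(m) = 1) = \Phi_0(l)\Phi_0(m) - \frac{\Phi_1(l)\Phi_1(m)}{N} - \frac{(\Phi_2(l) - \Phi_0(l))(\Phi_2(m) - \Phi_0(m))}{2N}$$
$$\qquad - \frac{(\Phi_4(l) - 6\Phi_2(l) + 3\Phi_0(l))\,\Phi_0(m) + \Phi_0(l)\,(\Phi_4(m) - 6\Phi_2(m) + 3\Phi_0(m))}{4N}.$$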
Introducing these formulae into (6), (7), (8) and using (16) to simplify them, we find that
$E(n_l)$ differs from $Np_l$ by terms of $O(N^{-1})$ so that the last term in (9) vanishes as $N \to \infty$,
and that the covariance matrix of the $n_l$ is defined by

$$\mathrm{var}(n_l) = N(p_l - p_l^2 - \psi_0^2(l) - \tfrac{1}{2}\psi_1^2(l)),$$
$$\mathrm{covar}(n_l, n_m) = N(-p_l p_m - \psi_0(l)\psi_0(m) - \tfrac{1}{2}\psi_1(l)\psi_1(m)).$$

Thus, in matrix form, the covariance matrix may be written as

$$\Lambda_2 = N(D - D11'D - \psi_0\psi_0' - \tfrac{1}{2}\psi_1\psi_1'), \tag{28}$$

which is the same as (18) except for the additional term in $\psi_1$.
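Thus the limiting distribution of $X^2$ is, in this case, of the form (19) where the $\lambda_i$ are the
latent roots of

$$(ND)^{-1}\Lambda_2 = I - 11'D - D^{-1}\psi_0\psi_0' - \tfrac{1}{2}D^{-1}\psi_1\psi_1'. \tag{29}$$

The latent roots $\lambda_i$ of (29) are the same as those of

$$I - D^{1/2}11'D^{1/2} - D^{-1/2}\psi_0\psi_0'D^{-1/2} - \tfrac{1}{2}D^{-1/2}\psi_1\psi_1'D^{-1/2},$$

so that the $\mu_i = 1 - \lambda_i$ are the roots of

$$D^{1/2}11'D^{1/2} + D^{-1/2}\psi_0\psi_0'D^{-1/2} + \tfrac{1}{2}D^{-1/2}\psi_1\psi_1'D^{-1/2},$$

i.e.

$$\left[D^{1/2}1,\; D^{-1/2}\psi_0,\; 2^{-1/2}D^{-1/2}\psi_1\right] \begin{bmatrix} 1'D^{1/2} \\ \psi_0'D^{-1/2} \\ 2^{-1/2}\psi_1'D^{-1/2} \end{bmatrix}.$$

But, by a well-known matrix result, the non-zero latent roots of $AA'$, where $A$ is an arbitrary
matrix, are the same as those of $A'A$. Since

$$1'D1 = 1, \qquad (1'D^{1/2})(D^{-1/2}\psi_0) = 1'\psi_0 = 0, \qquad (1'D^{1/2})(D^{-1/2}\psi_1) = 1'\psi_1 = 0,$$

the $\mu_i$ are the latent roots of

$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & \psi_0'D^{-1}\psi_0 & 2^{-1/2}\psi_0'D^{-1}\psi_1 \\ 0 & 2^{-1/2}\psi_0'D^{-1}\psi_1 & \tfrac{1}{2}\psi_1'D^{-1}\psi_1 \end{bmatrix}. \tag{30}$$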


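Thus, the roots $\lambda_i$ of (29) are 1 ($k - 3$ times), $1 - \mu_1$, $1 - \mu_2$, where $\mu_1$ and $\mu_2$ are the non-unit roots of (30).
For arbitrary systems of class intervals $\psi_0'D^{-1}\psi_1$ is not zero and the quadratic equation

$$\begin{vmatrix} \psi_0'D^{-1}\psi_0 - \mu & 2^{-1/2}\psi_0'D^{-1}\psi_1 \\ 2^{-1/2}\psi_0'D^{-1}\psi_1 & \tfrac{1}{2}\psi_1'D^{-1}\psi_1 - \mu \end{vmatrix} = 0$$

must be solved. If, however, the intervals $(-\infty, Z_1)$, $(Z_1, Z_2)$, ..., $(Z_{k-1}, \infty)$ are symmetrically
placed with respect to the sample mean, or the intervals $(-\infty, z_1)$, $(z_1, z_2)$, ..., $(z_{k-1}, \infty)$
symmetrically placed with respect to zero, inspection shows that $\psi_0'D^{-1}\psi_1$ is always zero.
In this case, the roots of $(ND)^{-1}\Lambda_2$ are therefore

$$\lambda_1 = (1 - \psi_0'D^{-1}\psi_0), \qquad \lambda_2 = (1 - \tfrac{1}{2}\psi_1'D^{-1}\psi_1), \qquad \lambda_i = 1 \ (k - 3 \text{ times}). \tag{31}$$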
Since the second term of (9) vanishes stochastically as $N \to \infty$, we have the result that the
limiting distribution of $X^2$ is that of

$$\chi^2_{k-3} + \lambda_1 y_1^2 + \lambda_2 y_2^2. \tag{32}$$

It is shown in the tables that $\lambda_2$ tends to zero as $k \to \infty$, although less rapidly than $\lambda_1$.
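Finally, to see whether the effect of using an estimator is the same for the variance as for
the mean, we consider the case where the sample mean and an estimator of variance with
$N - 1$ degrees of freedom, from an independent sample, are used. In this case

$$t_i = \frac{x_i - \bar{x}}{s\sqrt{(1 - 1/N)}} \qquad (i = 1, 2)$$

have, as has been shown by Cornish (1954), the joint density

$$\frac{\Gamma[\tfrac{1}{2}(N+1)]\,(1 - \rho^2)^{-1/2}}{\Gamma[\tfrac{1}{2}(N-1)]\,\pi(N-1)} \left(1 + \frac{Q}{N-1}\right)^{-\frac{1}{2}(N+1)},$$

where $\rho = -1/(N-1)$, $Q = (t_1^2 - 2\rho t_1 t_2 + t_2^2)/(1 - \rho^2)$. To $O(N^{-1})$

$$f(t_1, t_2) = \frac{e^{-\frac{1}{2}Q}}{2\pi(1 - \rho^2)^{1/2}} + \frac{1}{N}\,\frac{e^{-\frac{1}{2}(t_1^2 + t_2^2)}}{2\pi}\left\{\tfrac{1}{4}(t_1^2 + t_2^2)^2 - (t_1^2 + t_2^2)\right\}, \tag{33}$$

so that, defining new indicators $Y_i(l)$ in the obvious way,

$$P(Y_1(l)\,Y_2(m) = 1) = P\{Y_1(l)\,Y_2(m) = 1, \text{ when } \sigma^2 \text{ is known}\}$$
$$\qquad + \frac{1}{N}\int_{z_{l-1}}^{z_l}\int_{z_{m-1}}^{z_m} \frac{e^{-\frac{1}{2}(t_1^2 + t_2^2)}}{2\pi}\left\{\tfrac{1}{4}(t_1^2 + t_2^2)^2 - (t_1^2 + t_2^2)\right\} dt_1\,dt_2,$$

where, in the integral, $N$ has been put equal to $\infty$ since this term is already $O(N^{-1})$. Thus

$$P(Y_1(l)\,Y_2(m) = 1) = P\{Y_1(l)\,Y_2(m) = 1, \text{ when } \sigma^2 \text{ is known}\}$$
$$\qquad + \frac{1}{N}\left\{-\Phi_2(l)\Phi_0(m) - \Phi_0(l)\Phi_2(m) + \frac{\Phi_4(l)\Phi_0(m) + 2\Phi_2(l)\Phi_2(m) + \Phi_0(l)\Phi_4(m)}{4}\right\}. \tag{34}$$

Since

$$f(t) = \phi(t) + \frac{\phi(t)}{N}\left\{\tfrac{1}{4}(t^4 - 2t^2 - 1)\right\},$$

$$P(Y_1(l) = 1) = \Phi_0 + \frac{1}{4N}\{-\Phi_0 - 2\Phi_2 + \Phi_4\}. \tag{35}$$

Introducing (34) and (35) into (6), (7), (8) and using the expansions (16), the covariance
matrix $\Lambda_3$ of the $n_l$ is here found to satisfy

$$(ND)^{-1}\Lambda_3 = I - 11'D - D^{-1}\psi_0\psi_0' + \tfrac{1}{2}D^{-1}\psi_1\psi_1'.$$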
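Following the previous method, $X^2$ has as its limiting distribution

$$\chi^2_{k-3} + \lambda_1 y_1^2 + \lambda_2 y_2^2, \tag{36}$$

where

$$\lambda_1 = 1 - \psi_0'D^{-1}\psi_0, \qquad \lambda_2 = 1 + \tfrac{1}{2}\psi_1'D^{-1}\psi_1, \tag{37}$$

provided $\psi_0'D^{-1}\psi_1 = 0$, for which symmetry of the intervals $(z_{l-1}, z_l)$ about zero is a sufficient
condition. Since $\lambda_2 \to 2$, as all the intervals tend to zero, we have then

$$X^2 = \begin{cases} \chi^2_{k-3} + 2y^2 & \text{(variance estimated independently of the sample)}, \\ \chi^2_{k-2} & \text{(variance known)}, \\ \chi^2_{k-3} & \text{(variance estimated from the sample)}. \end{cases} \tag{38}$$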
In each case of (38) it is assumed that the sample mean has been used. Comparing (38)
with (26), it is seen that the variance behaves in the same way as the mean, in this regard. 


5. NUMERICAL RESULTS 


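Only the practical case where the mean and variance have been estimated by the sample
mean and variance, will now be considered. Restating the results (31), (32), for the simple
case of symmetrically placed intervals, we have

$$X^2 \sim \chi^2_{k-3} + \lambda_1 y_1^2 + \lambda_2 y_2^2,$$

where

$$\lambda_1 = 1 - \psi_0'D^{-1}\psi_0 = 1 - \sum_{l=1}^{k} \frac{\psi_0^2(l)}{p_l}, \qquad \lambda_2 = 1 - \tfrac{1}{2}\psi_1'D^{-1}\psi_1 = 1 - \tfrac{1}{2}\sum_{l=1}^{k} \frac{\psi_1^2(l)}{p_l},$$

and $\psi_0$, $\psi_1$ are defined in (15).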

In Table 1, the values of $\lambda_1$ and $\lambda_2$, $E(X^2)$ and $k - 3$ are given for various $k$ when class
intervals of equal probability content are used, in the spirit of Mann & Wald (see § 1).

In Table 2, the same quantities are given for the rule of § 1 using intervals of equal length.
A further column gives $h$, the length of the interval $(z_{l-1}, z_l)$, for $l = 2, \ldots, k - 1$.
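
The two roots for a symmetric system of intervals can be computed directly from the standardised boundaries. The sketch below assumes the readings $\psi_0(l) = \phi(z_l) - \phi(z_{l-1})$ and $\psi_1(l) = z_l\phi(z_l) - z_{l-1}\phi(z_{l-1})$, with both taken as zero at the infinite end-points; the function name `latent_roots` is illustrative, and the result is only meaningful when the intervals are symmetric about zero.

```python
from statistics import NormalDist


def latent_roots(z):
    """Return (lambda_1, lambda_2) for interior standardised boundaries z_1 < ... < z_{k-1}.

    Assumes the boundaries are symmetric about zero, so that the cross term
    psi_0' D^{-1} psi_1 vanishes.
    """
    nd = NormalDist()
    cdf = [0.0] + [nd.cdf(v) for v in z] + [1.0]
    phi = [0.0] + [nd.pdf(v) for v in z] + [0.0]
    zphi = [0.0] + [v * nd.pdf(v) for v in z] + [0.0]
    q0 = q1 = 0.0
    for l in range(1, len(z) + 2):
        p = cdf[l] - cdf[l - 1]                   # class probability p_l
        q0 += (phi[l] - phi[l - 1]) ** 2 / p      # psi_0(l)^2 / p_l
        q1 += (zphi[l] - zphi[l - 1]) ** 2 / p    # psi_1(l)^2 / p_l
    return 1.0 - q0, 1.0 - 0.5 * q1


# Six equally probable classes (Table 1) and four classes of length h = 1 (Table 2).
nd = NormalDist()
print(latent_roots([nd.inv_cdf(l / 6) for l in range(1, 6)]))
print(latent_roots([-1.0, 0.0, 1.0]))
```

For these two systems the computed roots agree with the tabulated 0.081, 0.459 and 0.118, 0.459 to three decimals; for other entries a recomputation may not reproduce every printed digit.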

Table 1

     k     $\lambda_1$   $\lambda_2$    $k-3$    $E(X^2)$
     2     0.363         1              -1       0.363
     3     0.207         0.773           0       0.980
     4     0.139         0.619           1       1.758
     5     0.103         0.592           2       2.695
     6     0.081         0.459           3       3.540
    10     0.042         0.297           7       7.339
  → ∞      0             0              $k-3$    $k-3$

Table 2

     k     $h$         $\lambda_1$   $\lambda_2$    $k-3$    $E(X^2)$
     2     $\infty$    0.363         1.000          -1       0.363
     3     1.0         0.197         0.738           0       0.935
     4     1.0         0.118         0.459           1       1.577
     4     0.5         0.174         0.738           1       1.912
     4     0.4         0.201         0.797           1       1.998
     5     1.0         0.088         0.272           2       2.360
     5     0.5         0.116         0.584           2       2.700
     5     0.4         0.145         0.676           2       2.821
     6     1.0         0.080         0.184           3       3.264
     6     0.5         0.077         0.440           3       3.517
    10     1.0         0.077         0.145           7       7.222
    10     0.5         0.025         0.103           7       7.128
  → ∞      -           0             0              $k-3$    $k-3$

It is now of interest to compare our results with those of Chernoff & Lehmann. As described
in § 1 they used a fixed system of class intervals $(-\infty, Z_1)$, $(Z_1, Z_2)$, ..., $(Z_{k-1}, \infty)$ and a normal
population with mean $\mu$ and variance $\sigma^2$. Their result is of the same form as ours and $\lambda_1$ and
$\lambda_2$ are again the roots of a certain matrix. If $Z_1, Z_2, \ldots, Z_{k-1}$ are expressed in terms of
$z_1, z_2, \ldots, z_{k-1}$ by the relations

$$Z_l = \mu + z_l\sigma \qquad (l = 1, \ldots, k - 1),$$

their results should be comparable with ours when one sets $\mu = 0$, $\sigma = 1$ in the matrix given
by Chernoff & Lehmann for the roots $\mu_i$ ($\lambda_i = 1 - \mu_i$). In fact, the matrices are then identical.
This is also seen arithmetically in Table 2; the values of $\lambda_1$ and $\lambda_2$ for $k = 4$, $h = 0.4$ are the
same as those in Chernoff & Lehmann's example where the intervals are $(-\infty, -1)$, $(-1, 0)$,
$(0, 1)$, $(1, \infty)$ and $\mu = 0$, $\sigma^2 = 2.5$.

This identity of our results with those of Chernoff & Lehmann when $\mu = 0$, $\sigma^2 = 1$ has
a simple interpretation. It means that, asymptotically, there is no difference between using
$(\mu, \sigma^2)$ and $(\bar{x}, s^2)$ for positioning and scaling the class intervals. Thus the variation of the
class intervals from sample to sample has no effect on the distribution of $X^2$. The reason why
$X^2$ here is not distributed as $\chi^2_{k-3}$ is the same as that pointed out by Chernoff & Lehmann:
the use of estimators based on the sample values and not on the cell frequencies. If this
is generally true, Chernoff & Lehmann's general theorem will hold whenever the class
intervals are conditioned by the values of the sample estimators.

The question, which now arises, is how the class intervals should be chosen. In §1, it 
was seen that considerations of power have led to the suggestion that many more intervals 
be used than now seems to be customary. Williams (1950) has suggested that Mann-Wald 
optimum is a broad one and that little will be lost if their recommended number of classes 
is halved. Our results, especially Tables 1 and 2, suggest that, if $X^2$ is to be compared with
the significance points of $\chi^2_{k-3}$, the number of classes should be large to avoid under-
estimation of the significance level. Before any specific recommendation can be made, 
agreement must be reached on the precise amount of underestimation which can be tolerated. 
However, any such judgement seems arbitrary and it seems better to give a method whereby 
anyone can derive his own rule. 

The asymptotic distribution of $X^2$ is that of a quadratic form in normal variables with
a dominant part $\chi^2_{k-3}$ and small part $\lambda_1 y_1^2 + \lambda_2 y_2^2$. The method of Pitman & Robbins (1949)
is not useful for a quadratic form of this type. But a variant* of the method of Harper &
Macdonald (1957) is ideally suitable. When $k$ is not much smaller than 10 the probability
that $X^2$ is greater than $u$ is given accurately by





—hu ytk-1 
T(3k) ay (1—A,)] f — a = 
where a, = (2—43(r, + 2) ($h—1), 
Ay = [44+ §(VE+ V9) — (¥y + Ye) + $Y 2] ($4 — 1) es (40) 
vy = 2A,/(1—A,), 
Yy = 2Ag/(1— Ay) | 


Thus, a given system of intervals may be examined by first computing $\lambda_1$ and $\lambda_2$ from the
formulae at the beginning of this section and then using (39) and (40) to find the probability
that $X^2$ exceeds the 5% (say) tabular value of $\chi^2_{k-3}$. Depending on what is considered to
be a tolerable deviation of this probability from 0-05, the system of intervals will be judged 
as satisfactory or not. 

* Suggested to the author in a letter from J. A. Macdonald. 
It is clear that no hard and fast rule will find acceptance, but it does seem worth while
to suggest a working rule of the very simplest kind. Ten intervals is a convenient round
number of intervals. Then the probability of $X^2$ exceeding $\chi^2_7(5\%) = 14.067$ may be found,
from (39) and (40), to be 0.057 for the case $k = 10$ in Table 1 and to be 0.054 and 0.052 for
the cases $k = 10$, $h = 1$, 0.5 (respectively) in Table 2. Thus, if ten intervals are formed from
almost any kind of subdivision of $(-\infty, \infty)$, one can be sure that the true significance level is
between 0.05 and 0.06, and usually much nearer to 0.05. Our rule is therefore to use at
least ten intervals, none of which should have 'small' expected frequencies. This latter
provision is made to reduce the possibility of small sample effects invalidating the
asymptotic results on which the rule rests.
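
The exceedance probability quoted for ten equally probable intervals can be checked independently of (39) and (40) by simulating the limiting distribution. This is a Monte Carlo sketch, not the approximation used in the paper; the function name `tail_probability`, the number of draws and the seed are arbitrary choices.

```python
import random


def tail_probability(lam1, lam2, df, u, n=200_000, seed=1):
    """Estimate P(chi2_df + lam1*y1^2 + lam2*y2^2 > u) by simulation."""
    rng = random.Random(seed)
    exceed = 0
    for _ in range(n):
        # A chi-square variable with df degrees of freedom is Gamma(df/2, scale 2).
        chi2 = rng.gammavariate(df / 2.0, 2.0)
        x2 = chi2 + lam1 * rng.gauss(0.0, 1.0) ** 2 + lam2 * rng.gauss(0.0, 1.0) ** 2
        if x2 > u:
            exceed += 1
    return exceed / n


# k = 10 in Table 1: lambda_1 = 0.042, lambda_2 = 0.297, compared with chi2_7(5%) = 14.067.
print(tail_probability(0.042, 0.297, 7, 14.067))
```

With this many draws the estimate should fall close to the 0.057 given in the text, within ordinary simulation error.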

The methods developed above have a bearing on a slightly different kind of problem, 
brought to my attention by J. B. Douglas. J. F. McIllwraith of the Storm Water Standards 
Committee, Sydney, examined the primary rain falls at 142 different stations to see whether 
the hypothesis of log-normal rain falls was tenable. A goodness-of-fit $X^2$ was calculated
from the results at each station, six equally probable intervals being used so that the cell
boundaries were given by

$$-\infty,\quad \bar{x} - 0.9674s,\quad \bar{x} - 0.4307s,\quad \bar{x},\quad \bar{x} + 0.4307s,\quad \bar{x} + 0.9674s,\quad +\infty.$$

The mean $X^2$ was 3.80 and Douglas observed that, because of the work of Chernoff &
Lehmann, this need not imply that the hypothesis should be rejected. With the results of the
present paper, the matter may be taken a step further since these $X^2$ have been calculated
exactly on the assumptions of this paper. In particular, from Table 1, $k = 6$ gives $\lambda_1 = 0.081$,
$\lambda_2 = 0.459$, so that on the null hypothesis, these $X^2$ are distributed as

$$\chi^2_3 + 0.081y_1^2 + 0.459y_2^2.$$

Thus $E(X^2) = 3.540$, $\mathrm{var}(X^2) = 6.217$.

The mean of 142 values of $X^2$ will be approximately normal with mean 3.540 and variance
0.04378. Since

$$\frac{3.80 - 3.540}{\sqrt{0.04378}} = 1.24,$$

the data, assuming homogeneity and independence of stations, do not refute the hypothesis.
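
The arithmetic of this comparison can be retraced in a few lines. The sketch takes the roots from Table 1 and the variance figure 6.217 as printed in the text rather than recomputing it; the variable names are illustrative.

```python
import math

lam1, lam2 = 0.081, 0.459          # Table 1, k = 6
mean_x2 = 3 + lam1 + lam2          # E(X^2) = (k - 3) + lambda_1 + lambda_2
var_x2 = 6.217                     # var(X^2), taken as printed
n_stations = 142
observed_mean = 3.80

var_of_mean = var_x2 / n_stations
deviate = (observed_mean - mean_x2) / math.sqrt(var_of_mean)
print(round(mean_x2, 3), round(var_of_mean, 5), round(deviate, 2))
```

A standardised deviate of about 1.24 is well within the usual range of a normal variable, which is the basis for the conclusion that the data do not refute the hypothesis.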


The author is grateful to Prof. P. A. P. Moran and Dr E. J. Hannan for some helpful 
discussions, to Mr J. A. Macdonald for providing his unpublished method of deriving the
distribution of $X^2$, to Mr J. B. Douglas and Mr J. F. McIllwraith for providing the data in
§5, and to the Editor and the Referee for the references to previous discussions of this 
problem. 


[Editorial Note added in proof. The Editor unfortunately failed to draw the author's attention to a
paper by D. E. Barton, entitled 'Neyman's $\psi_k^2$ test of goodness of fit when the null hypothesis is
composite', read at the International Congress of Mathematicians, Amsterdam (1954), and published
in the Skandinavisk Aktuarietidskrift (1956), pp. 216-45. Taking the rather more general $\psi_k^2$ statistic,
which becomes $X^2$ in a special case, Barton considered the use of variable class intervals and reached
a conclusion very similar to Watson's. Before going to press it has not been possible to ask Dr Watson
to examine the relation of these earlier results to his work. E.S.P.]
APPROXIMATIONS TO THE DISTRIBUTIONS OF SOME MEASURES
OF DISPERSION BASED ON SUCCESSIVE DIFFERENCES 


By Y. S. SATHE AND A. R. KAMAT


Fergusson College, Poona (India) 


1. INTRODUCTION 


The usual measures of dispersion such as the sample variance, the sample mean deviation, 
the sample range, etc. cannot be used when the mean of the population is undergoing a trend, 
since they are likely to be seriously affected by the trend. Under these circumstances and 
especially if the trend is a slow-moving one the following four measures of dispersion based 
on successive differences have been proposed: 


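$$\delta^2 = \frac{1}{n-1}\sum_{i=1}^{n-1} (x_i - x_{i+1})^2, \tag{1}$$

$$d = \frac{1}{n-1}\sum_{i=1}^{n-1} |x_i - x_{i+1}|, \tag{2}$$

$$\delta_2^2 = \frac{1}{n-2}\sum_{i=1}^{n-2} (x_i - 2x_{i+1} + x_{i+2})^2, \tag{3}$$

$$d_2 = \frac{1}{n-2}\sum_{i=1}^{n-2} |x_i - 2x_{i+1} + x_{i+2}|. \tag{4}$$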

The theory of the distribution of these statistics and their usefulness has been discussed 
in some detail by Von Neumann, Kent, Bellinson & Hart (1941) and Kamat (1953a, b, 
1954) among others. These measures of dispersion are considerably less efficient as estimators
than the sample variance, $s^2$, when the mean of the parent population remains constant. But
$s^2$ suffers from a heavy bias if the mean is undergoing a trend while statistics based on the
successive differences are comparatively unaffected.

It is not easy, however, to obtain the distribution of these statistics. Except for a modified
form of $\delta^2$ (for which see Kamat, 1955), it has not yet been possible to find the exact
distributions of these statistics. The procedure so far has been to approximate their distributions by
appropriate Pearson-type curves based on their first three or four moments. Apart from the 
laborious computation involved in the fitting of Pearson curves to these measures of disper- 
sion, a serious drawback to these approximations is that they cannot be used readily for 
comparing two independent estimates of variability about the trend taken over two different 
periods of time. 

Recently Cadwell (1953, 1954) has successfully used a power of $\chi^2$ to obtain approximate 
distributions to some measures of dispersion such as the mean deviation, the mean range and 
others. Following this method, one assumes that a statistic $u$ is approximately distributed 
as $(\chi^2_\nu/c)^\alpha$ or, taking $\lambda = 1/\alpha$, that $cu^\lambda$ is approximately distributed as $\chi^2$ with $\nu$ degrees of 
freedom. The constants $c$, $\alpha$ (or $\lambda$) and $\nu$ are then determined by equating the first three 
moments. 

In this paper we have used this method to obtain approximations to the distributions of 
each of the four statistics mentioned above. Our results show that over a wide range of 
sample size the same power of $\chi^2$, i.e. a constant value for $\alpha$ or $\lambda$, can be used for each of 
these statistics. Consequently, it not only becomes easy to find approximate percentage 
points by using percentage points of $\chi^2$, but it is also now possible to compare two different 
estimates of variability by use of the $F$ test. This is because we may write 

$$F = \frac{c_1\nu_2}{c_2\nu_1}\left(\frac{u_1}{u_2}\right)^{\lambda}, \tag{5}$$

where $c_1$, $c_2$ and $\nu_1$, $\nu_2$ are the scale parameters and equivalent degrees of freedom, found 
from Table 1, of the two $u$ estimators to be compared and $\lambda$ is the common power. 


Table 1. Values of $\nu$ and $\log_{10} c$ for measures of dispersion 
based on successive differences, keeping $\lambda$ constant 

        $\delta^2/\sigma^2$            $d/\sigma$              $\delta_2^2/\sigma^2$           $d_2/\sigma$
      λ = 0.7770        λ = 1.3219        λ = 0.7353        λ = 1.2195
  n     ν    log10 c      ν    log10 c      ν    log10 c      ν    log10 c

  5   4.879  0.4775    6.053  0.6957    3.528  0.0135    4.562  0.0288
  6   5.978  0.5618    7.434  0.7880      —      —         —      —
  7   7.081  0.6326    8.816  0.8642    5.398  0.1863    7.042  0.4822
  8   8.182  0.6933   10.193  0.9288      —      —         —      —
  9   9.284  0.7465   11.579  0.9853      —      —         —      —

 10  10.39   0.7941   12.963  1.0353    8.24   0.3615   10.815  0.6720
 11  11.49   0.8368   14.343  1.0800      —      —         —      —
 12  12.60   0.8760   15.732  1.1208      —      —         —      —
 13  13.70   0.9116   17.114  1.1579      —      —         —      —
 14  14.80   0.9445   18.500  1.1921      —      —         —      —

 15  15.91   0.9754   19.885  1.2240   12.99   0.5530   17.13   0.8744
 16  17.01   1.0039   21.27   1.2535      —      —         —      —
 17  18.12   1.0309   22.65   1.2811      —      —         —      —
 18  19.23   1.0564   24.04   1.3072      —      —         —      —
 19  20.32   1.0800   25.42   1.3317      —      —         —      —

 20  21.43   1.1028   26.81   1.3550   17.74   0.6853   23.45   1.0119
 25  26.94   1.2010   33.74   1.4557   22.50   0.7868   29.78   1.1164
 30  32.47   1.2814   40.64   1.5370   27.26   0.8690   36.12   1.2007
 40  43.52   1.4076   54.50   1.6651   36.77   0.9975   48.8    1.3319
 50  54.56   1.5052   68.33   1.7637   46.28   1.0966   61.4    1.4320




















The basis of Table 1 is described in §§ 3-7 below. It may be mentioned here that we have 
considered these statistics only for $n \ge 5$. The approximations are not expected to be 
adequate for lower sample sizes, especially if $\lambda$ is taken as constant and, moreover, it is 
unlikely that statistics based on successive differences will be used for very small samples. 

It seems probable that tests based on the mean-square differences, $\delta^2$ or $\delta_2^2$, will in general 
have somewhat greater power of discrimination than those based on the mean absolute 
differences, $d$ and $d_2$.* The possible advantages in using the absolute differences are: (i) speed 
in calculation, (ii) greater accuracy in the $(\chi^2_\nu/c)^\alpha$ approximation (see Tables 3, 6, 7 below), 
(iii) greater robustness if the variation is not normal. 

* If the variation is random and there is no trend, the asymptotic efficiencies for estimating $\sigma^2$ are 
known to be as follows: 

$\delta^2$: 66.7%, $\delta_2^2$: 51.4%; $d$: 60.56%, $d_2$: 47.1%. 


2. EXAMPLES 


Before proceeding to examine these approximations in more detail, we shall illustrate the 
use of the proposed variance-ratio test in comparing variability about the trend in a time 
series. The data used are dust readings taken with a Tyndallometer in a mine shaft. The 
readings were taken at half-minute intervals and the four series A, B, C and D were collected 
on the same morning during different periods, starting at the times shown. We are indebted 
to the Director of the Safety in Mines Research Establishment, Sheffield, for permission to 
use these data. The figures quoted are Tyndallometer readings multiplied by 1000, and the 
sequences in time read from left to right and from line to line. 

Series A (starting at 10.04 hr.)   173  176  130  153  137  144   95  187  131  126
                                   167  118   99  140  111   81   83  106  104  117
                                    78   88  119   87  124  125  102  109   92  114
                                     …

Series B (starting at 10.20 hr.)    70   28   52   49   36   38   45   43   38   57
                                    40   51   56   58   39   25   33   45   45   34
                                    43   42   20   34   18   30   23   23   19   30

Series C (starting at 10.47 hr.)    33   43   37   40   43   40   31   36   51   36
                                    45   43   36    …

Series D (starting at 10.57 hr.)    99   59  165    …
                                   162  187  115    …
                                   137   99  106    …

The four series are plotted in Fig. 1. It is clear that a trend is present in series A, B and C 
and possibly in D. For random series, it is known that* 

$$E(\delta^2) = 2\sigma^2;\quad E(d) = 2\sigma/\sqrt{\pi} = 1.1284\sigma;\quad E(\delta_2^2) = 6\sigma^2;\quad E(d_2) = 2\sqrt{3}\,\sigma/\sqrt{\pi} = 1.9544\sigma.$$


The upper section in Table 2 gives the four mean successive differences as defined in 
equations (1)-(4) above for each series. Below these are five estimates of the residual 
standard deviation, $\sigma$, i.e. 

$$\{\Sigma(x_i-\bar{x})^2/(n-1)\}^{1/2},\quad \sqrt{(\tfrac12\delta^2)},\quad d/1.1284,\quad \sqrt{(\tfrac16\delta_2^2)}\quad\text{and}\quad d_2/1.9544.$$

For series D the random fluctuations are so great that the removal of a linear or parabolic 
trend has scarcely reduced the estimate of variability. For the other three series the removal 
of a linear trend has an appreciable effect and for Series B and C there is a further, if smaller, 
reduction after allowance is made for a parabolic trend in the estimates based on $\delta_2^2$ and $d_2$. 
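
The four statistics of equations (1)-(4), and the estimates of $\sigma$ derived from them, can be checked numerically on Series B. The sketch below is illustrative only: the function names are ours, and the first reading in the third line of Series B is taken to be 43, which is an assumption (that reading reproduces the Series B entries of Table 2).

```python
import math

def successive_difference_stats(x):
    """Return (delta2, d, delta2_2, d2) as defined in equations (1)-(4)."""
    n = len(x)
    first = [x[i] - x[i + 1] for i in range(n - 1)]                  # first differences
    second = [x[i] - 2 * x[i + 1] + x[i + 2] for i in range(n - 2)]  # second differences
    delta2 = sum(v * v for v in first) / (n - 1)
    d = sum(abs(v) for v in first) / (n - 1)
    delta2_2 = sum(v * v for v in second) / (n - 2)
    d2 = sum(abs(v) for v in second) / (n - 2)
    return delta2, d, delta2_2, d2

def sigma_estimates(delta2, d, delta2_2, d2):
    """Estimates of sigma from E(delta2)=2s^2, E(d)=2s/sqrt(pi),
    E(delta2_2)=6s^2 and E(d2)=2*sqrt(3)*s/sqrt(pi)."""
    return (
        math.sqrt(delta2 / 2),
        d * math.sqrt(math.pi) / 2,
        math.sqrt(delta2_2 / 6),
        d2 * math.sqrt(math.pi) / (2 * math.sqrt(3)),
    )

SERIES_B = [
    70, 28, 52, 49, 36, 38, 45, 43, 38, 57,
    40, 51, 56, 58, 39, 25, 33, 45, 45, 34,
    43, 42, 20, 34, 18, 30, 23, 23, 19, 30,  # the leading 43 is an assumed reading
]

stats_b = successive_difference_stats(SERIES_B)
sigma_b = sigma_estimates(*stats_b)
```

Under that assumption the four statistics agree with the Series B column of Table 2 to the places printed there.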


We shall now use the variance-ratio approximation to compare the residual variation in (i) A 
and D, (ii) B and C. We have from equation (5), 

$$\log F = \log c_1 - \log c_2 + \log(\nu_2/\nu_1) + \lambda\log(u_1/u_2), \tag{6}$$

where the parameters $\log c_1$, $\log c_2$, $\nu_1$, $\nu_2$ and $\lambda$ derived from Table 1 are given in Table 2,† 
while $u_1$ and $u_2$ are the corresponding values of $\delta^2$, $d$, $\delta_2^2$ or $d_2$ for the two series compared, also 
taken from Table 2. 

* Values for $d$ and $d_2$ have been taken from Kamat (1953b, p. 119) and (1954, p. 7), respectively. 
† Interpolation in Table 1 was necessary for the case $n = 33$. 

Comparison of A and D 

                               F       ν₁      ν₂     Comment on significance
Comparison based on $\delta^2$      2.05    26.94   35.78   Just significant at 2.5% level
                    $d$        1.78    33.74   44.80   Between 5 and 2.5% levels
                    $\delta_2^2$      2.02    22.50   30.11   Between 5 and 2.5% levels
                    $d_2$       1.51    29.78   39.92   Just not significant at 10% level

The analysis suggests that the residual variation in Series D is greater than in A. We note, 
however, two points: (a) that the tests based on the mean-square differences, in this case at 
any rate, give clearer verdicts on significance than those based on the corresponding mean 
absolute differences; (b) that nothing has been gained by using second-difference estimates. 
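
The arithmetic of equation (6) for the comparison of A and D may be sketched as follows. This is only an illustration with a function name of our own; the inputs are the Table 2 entries, with Series D taken as estimator 1 and Series A as estimator 2, which is what the quoted values of $\nu_1$ and $\nu_2$ imply.

```python
import math

def log10_F(log_c1, log_c2, nu1, nu2, lam, u1, u2):
    """Right-hand side of equation (6), using common (base-10) logarithms."""
    return log_c1 - log_c2 + math.log10(nu2 / nu1) + lam * math.log10(u1 / u2)

# Series D (estimator 1) against Series A (estimator 2), values from Table 2.
F_delta2 = 10 ** log10_F(1.2010, 1.3235, 26.94, 35.78, 0.7770, 1852.25, 739.03)
F_d = 10 ** log10_F(1.4557, 1.5798, 33.74, 44.80, 1.3219, 35.250, 22.781)
F_delta2_2 = 10 ** log10_F(0.7868, 0.9120, 22.50, 30.11, 0.7353, 5914.96, 2282.35)
F_d2 = 10 ** log10_F(1.1164, 1.2447, 29.78, 39.92, 1.2195, 61.043, 43.355)
```

Each ratio is then referred to the $F$ distribution with $\nu_1$ and $\nu_2$ degrees of freedom.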
Fig. 1. Tyndallometer readings ( x 1000) of dust in a mine shaft 
taken at half-minute intervals. 
The first point is in accordance with what would be expected if the power of the former tests 
is the greater. With regard to (b), the chart for Series A suggests a curved trend, but this is 
perhaps not adequately represented by a second-order parabola. 
Comparison of B and C 
In this case we have only used $\delta^2$ and $\delta_2^2$; we find the following results: 

                               F       ν₁      ν₂     Comment on significance
Comparison based on $\delta^2$      1.05    21.43   32.47   Not significant
                    $\delta_2^2$      1.13    27.26   17.74   Not significant

The charts and standard deviation estimates given in Table 2 suggest that it is necessary to go 
at least to second differences before a legitimate comparison can be made, but clearly neither 
test provides any evidence for a difference in residual variation between the two series. 








Table 2. Data for comparison of Series A, B, C and D 

                                          A         B         C         D
 n                                       33        30        20        25
 $\delta^2$                                   739.03    194.28    206.89   1852.25
 $d$                                     22.781    10.759    10.895    35.250
 $\delta_2^2$                                  2282.35    501.04    418.44   5914.96
 $d_2$                                    43.355    17.750    16.889    61.043

 Estimates of σ from   $\Sigma(x_i-\bar{x})^2$       26.32     12.85     31.35     33.39
                       $\delta^2$              19.22      9.85     10.17     30.43
                       $d$                20.19      9.53      9.65     31.24
                       $\delta_2^2$              19.50      9.14      8.35     31.40
                       $d_2$               21.67      9.08      8.64     31.23

 Parameters from Table 1
   For $\delta^2$     λ                      0.7770    0.7770    0.7770    0.7770
              ν                       35.78     32.47     21.43     26.94
              log c                  1.3235    1.2814    1.1028    1.2010
   For $d$      λ                      1.3219       —         —      1.3219
              ν                       44.80       —         —       33.74
              log c                  1.5798       —         —      1.4557
   For $\delta_2^2$     λ                      0.7353    0.7353    0.7353    0.7353
              ν                       30.11     27.26     17.74     22.50
              log c                  0.9120    0.8690    0.6853    0.7868
   For $d_2$     λ                      1.2195       —         —      1.2195
              ν                       39.92       —         —       29.78
              log c                  1.2447       —         —      1.1164
3. BASIC APPROXIMATION 


Cadwell (1953) has shown that if $u$ is approximately distributed as $(\chi^2_\nu/c)^\alpha$, then $\nu$ and $\alpha$ are 
approximately given by the equations 

$$\nu = \frac{2}{(3V-\sqrt{\beta_1})^2}\left\{1 - \frac{(2V-\sqrt{\beta_1})^2\,(V+\sqrt{\beta_1})}{2(3V-\sqrt{\beta_1})} + \dots\right\}, \tag{7}$$

$$\alpha = \alpha_1 - \frac{\alpha_1(\alpha_1-1)^2}{2\nu} + \dots, \tag{8}$$

where $\alpha_1 = V\sqrt{(\tfrac12\nu)}$, and $V$ and $\beta_1$ are the values of the coefficient of variation and the $\beta_1$ 
constant of $u$. The constant $c$ is then obtained from the equation 

$$\log c = \log 2 + \lambda\{\log\Gamma(\alpha + \tfrac12\nu) - \log\Gamma(\tfrac12\nu) - \log E(u)\}, \tag{9}$$

where $\lambda = \alpha^{-1}$. These relations have been used to find approximations to the distributions 
for each of the four statistics $\delta^2/\sigma^2$, $d/\sigma$, $\delta_2^2/\sigma^2$ and $d_2/\sigma$. The exact value of $\alpha$ was found by 
inverse interpolation making use of Brownlee’s (1923) tables, and for small sample sizes 
double inverse interpolation was used to get the exact values of $\nu$ and $\alpha$. (This has been 
outlined by Cadwell on p. 338 of his paper.) Unlike Cadwell we have retained one or more 
places of decimals in $\nu$. In our case we find that by so doing it is easier to make the match 
of $V$ and $\beta_1$ more exact. It may be mentioned that the retaining of more places in $\nu$ 
does not involve appreciably more labour in the calculation of percentage points of these 
statistics, since in practice it was found that linear interpolation for $\nu$ in the percentage 
points of $\chi^2$ is quite adequate for this purpose. 
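
Relation (9) follows from $E(u) = (2/c)^\alpha\,\Gamma(\alpha+\tfrac12\nu)/\Gamma(\tfrac12\nu)$ and is easy to evaluate directly. The sketch below is an illustration rather than the computing procedure used for the tables: the function name is ours, and the log-gamma routine of the Python standard library stands in for Brownlee's tables. It is checked against two entries of Table 1, using $E(\delta^2/\sigma^2) = 2$ and $E(d_2/\sigma) = 2\sqrt{3}/\sqrt{\pi}$.

```python
import math

def log10_c(nu, lam, mean_u):
    """Equation (9): log10 of the scale constant c, given nu, lambda and E(u)."""
    alpha = 1.0 / lam
    # natural log of Gamma(alpha + nu/2) / Gamma(nu/2), converted to base 10 below
    ln_ratio = math.lgamma(alpha + nu / 2) - math.lgamma(nu / 2)
    return math.log10(2) + lam * (ln_ratio / math.log(10) - math.log10(mean_u))

# delta^2/sigma^2 at n = 20 (nu = 21.43, lambda = 0.7770); Table 1 gives 1.1028
log_c_delta2_n20 = log10_c(21.43, 0.7770, 2.0)

# d_2/sigma at n = 7 (nu = 7.042, lambda = 1.2195); Table 1 gives 0.4822
log_c_d2_n7 = log10_c(7.042, 1.2195, 2 * math.sqrt(3 / math.pi))
```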


4. THE MEAN-SQUARE SUCCESSIVE DIFFERENCE: $\delta^2$ 


Following the method summarized above and taking the moments from Von Neumann 
et al. (1941), $\nu$, $\lambda$ and $\log c$ were obtained for different values of $n$ for the statistic $\delta^2/\sigma^2$. 
Table 3 gives these values for a few representative sample sizes, $n$ = 5, 10, 20, 30, 50. 
The fifth column gives the difference between the value of $\beta_2$ for the approximating distribution* 
and the true value of $\beta_2$; for $n > 5$ the difference is at the most in the second decimal 
place and it goes on decreasing as the sample size increases. 


Table 3. Approximation to $\delta^2/\sigma^2$ by $(\chi^2_\nu/c)^\alpha$ 

                   Variable λ                          Fixed λ = 0.7770
  n      ν        λ       log c    β₂ difference    β₁ difference   β₂ difference

  5     4.98    0.7762   0.4788      +0.1007          -0.0013         +0.0913
 10    10.40    0.7766   0.7946      +0.0380          +0.0001         +0.0368
 20    21.4     0.7776   1.1020      +0.0163          +0.0003         +0.0180
 30    32.5     0.7765   1.2819      +0.0123          +0.0001         +0.0113
 50    54.4     0.7781   1.5036      +0.0050          +0.0003         +0.0070
| | | 








* Following Cadwell (1953), we obtain $\beta_2$ and, later, $\beta_1$ values for the approximation by noting that 
for $x = (\tfrac12\chi^2_\nu)^\alpha$, the $r$th moment about zero is 

$$\mu'_r = \Gamma(r\alpha + \tfrac12\nu)/\Gamma(\tfrac12\nu).$$
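
The $\beta_1$ and $\beta_2$ constants of the approximating distribution follow from these moments in the usual way. A minimal sketch is given below; the function name is ours, and the only check offered is the case $\alpha = 1$, where the variable is a gamma variate and $\beta_1 = 8/\nu$, $\beta_2 = 3 + 12/\nu$.

```python
import math

def beta_constants(nu, alpha):
    """beta_1 and beta_2 of (chi^2_nu / 2)^alpha from its moments about zero."""
    half = nu / 2
    # mu'_r = Gamma(r*alpha + nu/2) / Gamma(nu/2), r = 1..4
    m1, m2, m3, m4 = (
        math.exp(math.lgamma(r * alpha + half) - math.lgamma(half))
        for r in range(1, 5)
    )
    # central moments from moments about zero
    mu2 = m2 - m1 ** 2
    mu3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3
    mu4 = m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4
    return mu3 ** 2 / mu2 ** 3, mu4 / mu2 ** 2

b1_check, b2_check = beta_constants(10, 1.0)   # gamma case: 0.8 and 4.2
```

Since $\beta_1$ and $\beta_2$ do not depend on scale, the constant $c$ plays no part in this calculation.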
A variable value of $\lambda$ does not permit us to construct tests corresponding to $F$ tests. But 
the fuller table of values of $\lambda$ (which is not reproduced above) indicated that a suitable fixed 
value of $\lambda$ may be used as a compromise value for $n \ge 5$, thus allowing the comparison of 
two independent estimates of variability based on $\delta^2$. It was found that if we take $\lambda = 0.7770$ 
it yields a reasonably good approximation for $n \ge 5$. Table 1 gives the values of $\nu$ and $\log c$ for 
the fixed $\lambda = 0.7770$ for $n$ = 5(1) 20, 25, 30, 40, 50. In this case $\nu$ is calculated from the 
formula 

$$\nu = \nu_0 + 0.0824 - 0.1436\,\nu_0^{-1} + \dots, \tag{10}$$

where $\nu_0 = 3.3127/V^2$, $V^2$ being the square of the coefficient of variation of $\delta^2$. But this 
formula is useful only for $n > 9$, while for $n \le 9$ the exact value of $\nu$ is obtained by inverse 
linear interpolation for the value of $V^2$, by using Brownlee’s tables. Log $c$ was then evaluated 
from the relation (9) above. 
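
Formula (10) can be tried out on the $\delta^2/\sigma^2$ column of Table 1. The sketch below is illustrative and rests on one assumption not stated in the text: that for a random normal sample the squared coefficient of variation of $\delta^2$ is $V^2 = (3n-4)/(n-1)^2$, which follows from the variance of $\delta^2$. The function name is ours. It may also be noted that the leading constant equals $2\alpha^2 = 2/\lambda^2$.

```python
def nu_for_delta2(n, lead=3.3127, a=0.0824, b=0.1436):
    """Equation (10) for delta^2/sigma^2 with the fixed lambda = 0.7770."""
    v_squared = (3 * n - 4) / (n - 1) ** 2   # assumed V^2 of delta^2, normal sample
    nu0 = lead / v_squared
    return nu0 + a - b / nu0

# Table 1 gives 10.39, 21.43, 32.47 and 54.56 for n = 10, 20, 30 and 50.
nu_values = {n: nu_for_delta2(n) for n in (10, 20, 30, 50)}

# The leading constant of (10) is 2 / lambda^2.
lead_from_lambda = 2 / 0.7770 ** 2
```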

We should expect that $\beta_1$ will not match exactly when $\lambda$ is kept fixed. As mentioned 
above, in this case also we have retained two or three places of decimals in $\nu$ so as to match $\beta_1$ 
as closely as possible. The discrepancy between the $\beta_1$ value of the approximation and the 
true value of $\beta_1$ is now small, and it was found that it is only in the fourth decimal place for 
$n > 7$. Again, the discrepancy between the $\beta_2$ values of the approximation with a fixed $\lambda$ and 
the true value of $\beta_2$ is of the same order as the discrepancy in these values resulting 
from a variable $\lambda$. Both these facts are illustrated by columns 6 and 7 of Table 3. 


5. COMPARISON OF APPROXIMATE PERCENTAGE POINTS OF $\delta^2/\sigma^2$ 


The approximate percentage points of the distribution of $\delta^2/\sigma^2$ can now be readily obtained 
by the use of the table of percentage points of the $\chi^2$ distribution constructed by Hald & 
Sinkbaek (1950). Ordinary linear interpolation in these tables for fractional degrees of 
freedom is quite adequate to obtain the approximate significance points correct to two 
places of decimals. 
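
The calculation of a percentage point may be sketched as follows. This is not the tabular procedure described above: in place of interpolation in the Hald & Sinkbaek tables the sketch uses the Wilson-Hilferty cube-root approximation to the $\chi^2$ percentage point, which is a substitution of ours, and the function names are hypothetical. The values of $\nu$ and $\log c$ are those of Table 1 for $n = 20$.

```python
from statistics import NormalDist

def chi2_quantile_wh(p, nu):
    """Wilson-Hilferty approximation to the p-quantile of chi^2 with nu d.f."""
    z = NormalDist().inv_cdf(p)
    t = 2.0 / (9.0 * nu)
    return nu * (1.0 - t + z * t ** 0.5) ** 3

def percentage_point(p, nu, log10_c, lam):
    """p-quantile of u when c * u**lam is distributed as chi^2 with nu d.f."""
    return (chi2_quantile_wh(p, nu) / 10 ** log10_c) ** (1.0 / lam)

# delta^2/sigma^2 at n = 20: nu = 21.43, log10 c = 1.1028, lambda = 0.7770
upper_5 = percentage_point(0.95, 21.43, 1.1028, 0.7770)
lower_5 = percentage_point(0.05, 21.43, 1.1028, 0.7770)
```

For $n = 20$ these should come close to the tabulated 5% points, 3.46 and 0.92, the small differences being due to the cube-root approximation.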


Table 4. Comparison of percentage points given by various approximations to $\delta^2/\sigma^2$ 

                                  n = 15                        n = 20
                            Lower         Upper           Lower         Upper
 Approximations            1%     5%     5%     1%       1%     5%     5%     1%

 Moore (1)                0.49   0.77   3.70   4.71     0.63   0.90   3.45   4.27
 Moore (2)                0.56   0.80   3.72   4.78     0.68   0.93   3.46   4.32
 $(\chi^2_\nu/c)^\alpha$, α fixed         0.53   0.79   3.71   4.78     0.66   0.92   3.46   4.32
 Pearson type VI          0.54   0.79   3.71   4.78     0.67   0.92   3.46   4.32
































Various approximations have been recently suggested for the distribution of the mean-square 
successive difference. For instance, Moore (1955) has suggested two approximations: 
(1) $\delta^2/\sigma^2 \sim c\chi^2_\nu/\nu$, and (2) $\delta^2/\sigma^2 \sim c_1\chi^2_\nu/\nu + c_2$. The first is obtained by matching the first two 
moments and the second by matching the first three moments. Gayen & Jogdeo (1955) have 
followed the method suggested by Cornish & Fisher (1937) of using the exact distribution of 
the sample variance $s^2$ as a first approximation to that of $\delta^2$, and utilizing the first four 
cumulants of $s^2$ and $\delta^2$ for obtaining approximations of higher order. We give in Table 4 
comparisons of percentage points of $\delta^2/\sigma^2$ for $n$ = 15 and 20 obtained from the two approximations 
given by Moore (named Moore (1) and Moore (2)), the approximation $(\chi^2_\nu/c)^\alpha$ for fixed $\alpha$ 
suggested here and the Pearson type VI approximation based on the first four moments. Of 
these the last may be considered the closest to the true distribution of $\delta^2/\sigma^2$ as it is based on 
the first four moments, while all others are based on three moments. (The percentage 
points obtained from the approximation suggested by Gayen & Jogdeo (1955) are not 
given here as they do not appear to be as close to the Pearson approximation as the values 
obtained by the other three methods.) 

It appears from this comparison that the approximation $(\chi^2_\nu/c)^\alpha$ is quite close to the 
Pearson-type approximation. The approximation, Moore (2), based on the first three 
moments is equally close; however, $F$ tests based on it will not be as simple as those based 
on $(\chi^2_\nu/c)^\alpha$ for a fixed $\alpha$. The approximation, Moore (1), which would be ideal for constructing 
$F$ tests does not appear to be sufficiently close in the tails of the distribution. 

We have not seen in the relevant papers on $\delta^2$ any table of its approximate percentage 
points. We are therefore giving in Table 5 the approximate upper and lower 0.5, 1.0, 
2.5 and 5.0 % points for $\delta^2/\sigma^2$ for $n$ = 5(1) 20, 25, 30, 40 and 50 based on $(\chi^2_\nu/c)^\alpha$, $\alpha$ fixed. 


Table 5. Percentage points for the approximate distribution of $\delta^2/\sigma^2$ ($\lambda$ constant) 

                 Lower                          Upper
  n     0.5    1.0    2.5    5          5     2.5    1.0    0.5

  5    0.07   0.11   0.18   0.27      5.24   6.35   7.84   8.98
  6    0.11   0.16   0.25   0.35      4.91   5.87   7.13   8.09
  7    0.16   0.21   0.31   0.43      4.66   5.50   6.61   7.44
  8    0.20   0.26   0.37   0.49      4.46   5.21   6.20   6.95
  9    0.24   0.31   0.42   0.55      4.30   4.99   5.88   6.55

 10    0.28   0.35   0.49   0.60      4.16   4.80   5.62   6.23
 11    0.32   0.39   0.53   0.65      4.04   4.64   5.40   5.97
 12    0.36   0.43   0.56   0.69      3.94   4.50   5.21   5.74
 13    0.39   0.46   0.59   0.73      3.86   4.38   5.05   5.54
 14    0.42   0.50   0.63   0.76      3.78   4.28   4.91   5.37

 15    0.45   0.53   0.66   0.79      3.71   4.18   4.78   5.22
 16    0.48   0.56   0.69   0.82      3.65   4.10   4.67   5.09
 17    0.51   0.59   0.72   0.85      3.60   4.03   4.57   4.97
 18    0.53   0.61   0.75   0.88      3.54   3.96   4.48   4.86
 19    0.56   0.64   0.77   0.90      3.50   3.90   4.40   4.77

 20    0.58   0.66   0.79   0.92      3.46   3.84   4.32   4.68
 25    0.68   0.76   0.89   1.02      3.29   3.61   4.03   4.32
 30    0.76   0.84   0.97   1.09      3.16   3.45   3.81   4.08
 40    0.88   0.96   1.08   1.20      2.99   3.23   3.53   3.74
 50    0.98   1.01   1.17   1.27      2.88   3.09   3.34   3.53
6. THE MEAN SUCCESSIVE DIFFERENCE 


In the same manner using the moments given by Kamat (1953, p. 119), $\nu$, $\lambda$ and $\log c$ were 
obtained for the mean successive difference $d/\sigma$. A compromise value for a common $\lambda$ was 
then found as $\lambda = 1.3219$, and for this fixed value of $\lambda$, $\nu$ was calculated from the formula 

$$\nu = \nu_0 + 0.0593 - 0.0610\,\nu_0^{-1} + \dots, \tag{11}$$

where $\nu_0 = 1.1446/V^2$, $V$ being the coefficient of variation for $d/\sigma$. Log $c$ was then obtained 
from the relation (9).* Table 6 gives values of $\nu$, $\lambda$ and $\log c$ for the representative sample 
sizes $n$ = 5, 10, 20, 30, 50 and columns 5, 6 and 7 in that table give the difference in $\beta_2$ values 
for a variable $\lambda$, difference in $\beta_1$ values for the fixed $\lambda$ and the difference in $\beta_2$ values for the 
fixed $\lambda$, respectively. Table 1 gives values of $\nu$ and $\log c$ for this fixed $\lambda = 1.3219$ for 
$n$ = 5(1) 20, 25, 30, 40, 50. We need not repeat the arguments and the conclusions in this case 
which are similar to those in the case of $\delta^2/\sigma^2$ presented in §4 above. A comparison of the 
$\beta_2$ differences in Tables 3 and 6 suggests that the approximation to the $d/\sigma$ distribution may 
be closer than that for $\delta^2/\sigma^2$. 


Table 6. Approximation to $d/\sigma$ by $(\chi^2_\nu/c)^\alpha$ 

                   Variable λ                          Fixed λ = 1.3219
  n      ν        λ       log c    β₂ difference    β₁ difference   β₂ difference

  5     6.16    1.3101   0.7047      +0.0445          -0.0066         +0.0301
 10    13.02    1.3189   1.0375      +0.0157          -0.0008         +0.0144
 20    26.8     1.3221   1.3549      +0.0070          -0.0001         +0.0077
 30    40.5     1.3242   1.5354      -0.0015          +0.0002         +0.0065
 50    67.9     1.3261   1.7608      +0.0058          +0.0001         +0.0028


In his 1953 paper (p. 120) Kamat gave a table of percentage 
points of the distribution of $d/\sigma$ obtained by using a Pearson-curve approximation, having 
the correct first four moments. We have compared these results with those obtained from the 
$(\chi^2_\nu/c)^\alpha$ approximation now recommended (with $\lambda = 1.3219$) at $n$ = 5, 10 and 50 at both upper 
and lower 0.5, 1.0, 2.5 and 5.0 % points and have found a most satisfactory agreement. The 
two series of figures were usually identical to the two decimal places given and never differed 
by more than a unit in the second decimal place. We do not, of course, know the true 
distribution of $d/\sigma$, but general experience suggests that the 4-moment Pearson curve 
representation is likely to be good.† We think, therefore, that our present approximation in 
terms of a power of $\chi^2$ is likely to be entirely adequate for our purpose. 

* Unlike $\delta^2/\sigma^2$ the formula (11) was found to be useful for all $n \ge 5$ and linear interpolation in 
Brownlee’s tables for $V^2$ was not necessary. 

† In the single case $n = 3$, where the distribution of $d/\sigma$ is far from normal, having $\beta_1 = 1.0356$, 
$\beta_2 = 4.2962$, Kamat (1953b, p. 121) was able to derive the true distribution and found that the 4-moment 
fit corresponded very closely with the true curve. 


7. STATISTICS BASED ON SECOND SUCCESSIVE DIFFERENCES: $\delta_2^2/\sigma^2$ AND $d_2/\sigma$ 


The same method, taking the moments given by Kamat (1954, pp. 4, 7), was used to 
approximate to the distributions of $\delta_2^2/\sigma^2$ and $d_2/\sigma$. Here also by trial we found that a fixed 
$\lambda = 0.7353$ is most suitable for $\delta_2^2/\sigma^2$ for $n \ge 7$ and a fixed $\lambda = 1.2195$ is adequate for $d_2/\sigma$ 
for $n \ge 5$.* 

In the case of $\delta_2^2/\sigma^2$ for the fixed $\lambda = 0.7353$, $\nu$ was obtained from 

$$\nu = \nu_0 + 0.1296 - 0.2406\,\nu_0^{-1} + \dots, \tag{12}$$

where $\nu_0 = 3.6992/V^2$, $V$ being the coefficient of variation of $\delta_2^2$ (this formula is useful for 
$n \ge 10$; for $n < 10$, the exact value of $\nu$ is obtained by linear interpolation as explained in § 4). 
For $d_2/\sigma$ with $\lambda = 1.2195$ kept fixed, $\nu$ was calculated from 

$$\nu = \nu_0 + 0.0324 - 0.0358\,\nu_0^{-1} + \dots, \tag{13}$$

where $\nu_0 = 1.3448/V^2$ and $V$ is the coefficient of variation of $d_2$. In both cases $\log c$ was then 
calculated from the equation (9). 


Table 7. $\beta_1$ and $\beta_2$ differences for the $(\chi^2_\nu/c)^\alpha$ approximation to the distributions 
of $\delta_2^2/\sigma^2$ and $d_2/\sigma$ using a fixed $\lambda$ 

          $\delta_2^2/\sigma^2$, λ = 0.7353              $d_2/\sigma$, λ = 1.2195
  n     β₁ difference   β₂ difference     β₁ difference   β₂ difference

  5       +0.0749         +0.4517            -0.020          +0.088
  7       -0.0008         +0.1400            -0.008          +0.045
 10       -0.0008         +0.0782            -0.001          +0.031
 15       +0.0005         +0.0482            +0.001          +0.022
 20       +0.001          +0.0354            +0.002          +0.016
 25       +0.0009         +0.0273            +0.002          +0.013
 30       +0.0009         +0.0222            +0.001          +0.011
 40       +0.0007         +0.0162            +0.002          +0.006
 50       +0.0006         +0.0129            +0.002          +0.007


Values of $\nu$ and $\log c$ for $\delta_2^2/\sigma^2$ and $d_2/\sigma$ for sample sizes $n$ = 5, 7, 10, 15, 20, 25, 30, 40, 50 
have been already given in Table 1. Discrepancies in the $\beta_1$ and $\beta_2$ values for fixed values of $\lambda$ 
are given in Table 7. We have not discussed here these approximations to the distributions 
of $\delta_2^2/\sigma^2$ and $d_2/\sigma$ to the extent we have done above in the case of $\delta^2/\sigma^2$ and $d/\sigma$, since they are 
in less frequent use than the latter. 


In conclusion we wish to thank Prof. E. S. Pearson, who has drawn the attention of one of 
us to Cadwell’s approximation and suggested improvements in the drafting of the paper. 

* For $n = 5$ the difference in $\beta_2$ was considerably larger for a fixed $\lambda$ than for a variable $\lambda$ for $\delta_2^2/\sigma^2$, 
but not so for $d_2/\sigma$. This is borne out by the following values of $\beta_2$ differences for variable $\lambda$ compared 
with those given in Table 7. 

              n      ν        λ       β₂ difference
 $\delta_2^2/\sigma^2$        5    3.418   0.7463      +0.2939
 $d_2/\sigma$          5    4.71    1.1998      +0.126
QUEUEING WITH BALKING 


By FRANK A. HAIGHT 
Auckland University College, New Zealand 


1. INTRODUCTION 


In dealing with problems of queueing, several writers (Kolmogoroff, 1932; Erlang*; 
Kendall, 1951, 1953; Lindley, 1952; Takacs, 1955) have discussed the situation where 
queue stability is obtained by assuming that the demand for service does not overload the 
service mechanism. Thus $\lambda$, the average number of arrivals per unit time, is assumed to be 
less than $\mu$, the average number of departures per unit time, so that their ratio, $\rho$, is less 
than unity. Kawata (1955), on the other hand, has shown that queue stability can also be 
obtained by assuming that, although arrivals occur more frequently than departures, some 
arrivals choose not to join the queue. In theory this case can be included in the original 
one, simply by supposing $\lambda$ to be computed only from the values provided by those who 
actually join the queue. However, if the decision to join or not depends on some random 
variables, it is sensible to inquire into the relationship between these variables and those 
which characterize the queue, such as queue length and waiting time. 

The factors which influence the decision of a person to join a queue or not may be con- 
sidered under two general headings: (a) those relating to the importance of being served, 
and (b) those relating to the obstacle which the queue presents, namely the waiting time 
which he must experience. It is obvious that the waiting time cannot be found without 
a knowledge of the service times of all those in the queue; we shall, therefore, make the 
simplifying assumption that the individual measures the obstacle presented by the queue 
by its length when he arrives, which will be denoted by $k(t)$. 

The factors included in (a) may be much more complicated, and may produce an opinion 
ranging from absolute urgency, so that a queue of arbitrary length will be joined, to 
absolute indifference, so that no non-zero queue will be joined. It will be assumed that these 
factors have been weighed by the individual before he arrives, and have produced in his 
mind an integer K, which is the greatest queue length that he will tolerate. If then he 
observes $k(t) \le K$ on arrival, he joins the queue; but if $k > K$, he goes away and does not 
return. We assume that the values K chosen by the various individual arrivals may be 
regarded as being random samples from a certain distribution—the balking distribution. 
We shall sometimes allow K to be infinite with non-zero probability, so that a finite pro- 
portion of the arrivals will join any queue. In this paper we assume that if a person joins 
a queue, he must remain for service. In another paper we will permit him to test sequenti- 
ally whether to stay or go, while he waits. 

While the principal interest here is $\rho > 1$, it should be noted that the classic queueing 
cases are included, and that the appropriate results can be obtained by letting $K \to \infty$ 
when $\rho < 1$. 

These results are (implicitly) special cases of a general set-up considered by Kendall & 
Reuter (1957). We derive them by assuming equilibrium, and investigate them numerically 
for various balking distributions. 


* See Brockmeyer et al. (1948). 
2. DIFFERENTIAL EQUATIONS 

Suppose both arrivals and departures occur as events of homogeneous Poisson processes, 
of density $\lambda$, $\mu$, respectively. Let 

$$P(x,t) = \Pr(k \le x \text{ at time } t), \qquad F(x) = \Pr(K \le x),$$

$$Q(x,t) = 1 - P(x,t), \qquad G(x) = 1 - F(x),$$

$$p(x,t) = P(x,t) - P(x-1,t), \qquad f(x) = F(x) - F(x-1).$$

Note that in general $p(0,t) = P(0,t) \ne 0$; this is the probability that no queue exists at 
time $t$. Also $f(0) = F(0) \ne 0$; this is the probability that an individual is absolutely 
queue-resistant. The distribution of $K$ will be called the balking distribution. 

Given a queue of length $x$, then the probability that an arrival joins it 

$$= \Pr(\text{his balking value} \ge x) = G(x-1).$$
The build-up of differential equations, following Feller, is 

$$p(x,t+\Delta t) = [1-(\lambda+\mu)\Delta t]\,p(x,t) + \mu p(x+1,t)\Delta t + \lambda p(x-1,t)G(x-2)\Delta t + \lambda p(x,t)F(x-1)\Delta t + O(\Delta t)^2,$$

where the four contributions on the right-hand side are from 

(a) no arrivals or leavers in $\Delta t$, 
(b) one leaver in $\Delta t$, 
(c) one arrival in $\Delta t$ who joins, 
(d) one arrival in $\Delta t$ who balks, 

respectively. 

Take $p(x,t)$ from both sides, divide by $\Delta t$, and take the limit 

$$\frac{\partial}{\partial t}p(x,t) = -(\lambda+\mu)p(x,t) + \mu p(x+1,t) + \lambda p(x-1,t)G(x-2) + \lambda p(x,t)F(x-1),$$

where, if $x = 0$, we delete $\mu$ in the first term; $F(-1) = 0 = F(-2)$. Add over $x$, giving 

$$\frac{\partial}{\partial t}P(x,t) = \mu p(x+1,t) - \lambda p(x,t)G(x-1). \tag{1}$$


Writing $\lambda G(x) = \lambda_x$, (1) can be reduced to a form similar to the equation for a birth-and-death 
process given by Feller (1950), but differing both with respect to one subscript and 
with respect to the initial equation, namely 

$$\frac{\partial}{\partial t}P(x,t) = \lambda_{x-1}P(x-1,t) - (\mu+\lambda_{x-1})P(x,t) + \mu P(x+1,t).$$

If, however, each person must join the queue, so that $F(x) = 0$ for all $x$, then (1) becomes 
a special case of the birth-and-death equation. 

Using a method suggested by Koopman (1953) we can give a method for computing 
solutions of (1) in cases where they exist. Let $\phi(x,s)$ be the Laplace transformation of 
$p(x,t)$. Transforming, (1) becomes 

$$-\lambda G(x-2)\phi(x-1,s) + (s + \lambda G(x-1) + \mu)\phi(x,s) - \mu\phi(x+1,s) = \delta_x,$$

where $\phi(-2,s) = \phi(-1,s) = 0$, $\delta_x = 1$ for $x = 0$ and 0 otherwise, thus assuming the queue 
empty at $t = 0$. Writing $\psi(x,s) = \phi(x,s)/\phi(x-1,s)$ 
and using the equations corresponding to $x = 1, 2, \ldots$, it is seen that $\psi(1,s)$ may be written 
as a continued fraction 

$$\psi(1,s) = \cfrac{\lambda}{B_1 - \cfrac{\lambda\mu G(0)}{B_2 - \cfrac{\lambda\mu G(1)}{B_3 - \cdots}}},$$

where $B_x = s + \lambda G(x-1) + \mu$ is the coefficient of $\phi(x,s)$ in the transformed equation. 
Substituting this value back into the equation for $x = 0$, we find $\phi(0,s)$, and hence each 
$\phi(x,s)$. Values of $p(x,t)$ may then be found by the numerical inversion of the Laplace 
transform. 





3. EQUILIBRIUM DISTRIBUTIONS 


If equilibrium distributions of queue length exist as $t \to \infty$, they will be denoted by suppressing 
the letter $t$, and can be found from (1) by setting the left side equal to zero 

$$p(1) = \rho p(0), \qquad p(x+1) = \rho G(x-1)p(x) \quad (x = 1, 2, \ldots). \tag{2}$$

Defining $c_0 = c_1 = 1$, $c_x = \prod_{i=0}^{x-2} G(i)$ $(x = 2, 3, \ldots)$, this can be written 

$$p(x) = \rho^x c_x\,p(0). \tag{3}$$

Summing, $1 = p(0)\sum_{x=0}^{\infty}\rho^x c_x$. 

Kawata (1955) and Kendall & Reuter (1957) have shown that the convergence of the 
series $\sum_{0}^{\infty}\rho^x c_x$ is necessary and sufficient for equilibrium to be attained, replacing the classical 
condition $\rho < 1$ (which is contained); the condition is that $p(0) > 0$. 

Thus, the tails of the balking distribution are proportional to the ratio of ordinates of 
the queue length distribution, and when $\rho$ is known, either may be computed from the 
other. Some examples will be given in § 5. 
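
Relations (2) and (3) give a direct numerical route from a balking distribution to the equilibrium queue-length distribution. The following is a sketch only: the function names and the particular balking distribution ($K$ uniform on 0, 1, ..., 4) are hypothetical choices of ours, and for an infinite balking distribution the truncation at `max_len` presupposes that the series $\sum\rho^x c_x$ converges.

```python
def G_uniform(x, top=4):
    """Pr(K > x) when K is uniform on 0, 1, ..., top (a hypothetical example)."""
    if x < 0:
        return 1.0
    return max(0.0, 1.0 - (x + 1) / (top + 1))

def equilibrium(rho, G, max_len=10_000):
    """Equilibrium p(x) from (2)-(3): p(x+1) = rho * G(x-1) * p(x), G(-1) = 1."""
    weights = [1.0]                          # rho^x * c_x, starting at x = 0
    for x in range(max_len):
        w = weights[-1] * rho * G(x - 1)     # weight for queue length x + 1
        if w == 0.0:                         # complete balking beyond this length
            break
        weights.append(w)
    total = sum(weights)                     # equals 1 / p(0)
    return [w / total for w in weights]

p_example = equilibrium(2.0, G_uniform)      # rho = 2, i.e. arrivals outpace service
```

With $\rho = 2$ the queue length is confined to 0, 1, ..., 5, so that equilibrium is attained although $\rho > 1$.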

From (2), we have (if $p(x)$, $p(x+1) \ne 0$) 

$$f(x) = G(x-1) - G(x) = \frac{p(0)}{p(1)}\left\{\frac{p(x+1)}{p(x)} - \frac{p(x+2)}{p(x+1)}\right\}, \tag{4}$$

so we must have 

$$p(x+1)^2 \ge p(x)\,p(x+2). \tag{5}$$

If $p(n) \ne 0$ but $p(n+1) = 0$, then from (2) we have $G(n-1) = 0$ and so $p(x) = 0$ for all $x > n$. 
This is the case where there is complete balking for queues of length $n$ or more. Any finite 
distribution of queue-length satisfying (5) is attainable in this way; the condition for 
equilibrium will be satisfied for all $\rho$. 

Any infinite distribution of queue-length (satisfying (5)) will correspond to a balking 
distribution (from (4)); there will be a positive probability that $K$ is infinite (i.e. $G(\infty) \ne 0$) 
unless 

$$\lim_{x\to\infty}\frac{p(x+1)}{p(x)} = 0. \tag{6}$$

If (6) is not satisfied, there is an upper limit to $\rho$ for equilibrium to be attained. We have 
$c_x \sim A(G(\infty))^x$, so $\sum\rho^x c_x$ converges only if 

$$\rho < \frac{1}{G(\infty)} = \rho\lim_{x\to\infty}\frac{p(x)}{p(x+1)}.$$

The reverse problem, of finding $p(x)$ when $f(x)$ is given can be solved, at least numerically, 
in every case. However, there are some algebraic difficulties in the three steps of an 
analytical solution: (a) expressing $\rho G(x)$ in a convenient form, (b) evaluation of $c_x$, and 
(c) summing the series $\sum\rho^x c_x$. 
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4. GENERATING FUNCTIONS 


Let n(8,t) = % sp(2,1) = (1-8) 3 #*P(a,!), 


&(s,t) = 3% sp(0+ 1, t) F(x). 


Summing (1) after multiplication by s*, we obtain 


r) 
(hee) = -av00.t+ (4 -a) (nie, 200.0) +Ask(s,0. 


Letting $t \to \infty$ with the usual change of notation, we have (assuming equilibrium)

$$\pi(s) - \rho s\,\pi(s) - p(0) + \rho s^2\,\xi(s) = 0. \qquad (7)$$

With $s = 1$, $\pi(0) = p(0) = 1 - \rho + \rho\,\xi(1)$,
and since $0 < p(0) < 1$, $(\rho - 1)/\rho < \xi(1) < 1$.


It will be noted that $\xi(1)$ is just the (asymptotic) probability that an individual balks.
Recalling that $\rho$ was calculated for all arrivals, whether they joined or not, it can now be
seen that an effective value of the traffic intensity, computed only from those who join
the queue, say $\rho'$, can be written $\rho' = \rho - \rho\,\xi(1)$.


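Writing now $p_\rho(x)$, $\pi_\rho(s)$, $m_\rho$ (mean queue length), etc., to denote the dependence on $\rho$,
we have from (3)

$$\pi_\rho(s) = p_\rho(0)\sum_x s^x \rho^x c_x = \sum_x (\rho s)^x c_x \Big/ \sum_x \rho^x c_x = p_\rho(0)/p_{\rho s}(0), \qquad (8)$$

and using (7)

$$\xi_\rho(s) = \frac{p_\rho(0)}{\rho s^2}\left\{1 - \frac{1-\rho s}{p_{\rho s}(0)}\right\}. \qquad (9)$$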


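Thus the queue length and balking distributions are uniquely determined once $p_\rho(0)$ is
known as a function of $\rho$.
We have from (3)

$$\rho\frac{\partial}{\partial\rho}\log p_\rho(x) = \rho\frac{\partial}{\partial\rho}\Big(x\log\rho + \log c_x - \log\sum_y \rho^y c_y\Big) = x - m_\rho, \qquad (10)$$

which essentially determines $p_\rho(x)$ (and in particular $p_\rho(0)$) when $m_\rho$ is known (as a function
of $\rho$). Also, the variance of the queue length,

$$v_\rho = \sum_x p_\rho(x)\,(x-m_\rho)^2 = \sum_x (x-m_\rho)\,\rho\frac{\partial}{\partial\rho}p_\rho(x) = \rho\frac{\partial m_\rho}{\partial\rho} \qquad (11)$$

(since $\sum p_\rho(x) = 1$).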
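$\pi_\rho(s)$ and $\xi_\rho(s)$ each satisfy a simple partial differential equation; from (8) we have

$$\log \pi_\rho(s) = \log \pi_\rho(0) + g_1(\rho s),$$

whence

$$\Big(\rho\frac{\partial}{\partial\rho} - s\frac{\partial}{\partial s}\Big)\log \pi_\rho(s) = \rho\frac{\partial}{\partial\rho}\log \pi_\rho(0) = -m_\rho.$$

Also from (9)

$$\log \xi_\rho(s) = \log \pi_\rho(0) - \log \rho s^2 + g_2(\rho s),$$

$$\Big(\rho\frac{\partial}{\partial\rho} - s\frac{\partial}{\partial s}\Big)\log \xi_\rho(s) = -m_\rho - 1 + 2 = 1 - m_\rho,$$

where $g_1(\ )$ and $g_2(\ )$ are used to denote ‘a function of’ and ‘another function of’ in the
above equations.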
Differentiating (7) and setting $s = 1$ gives

$$m_\rho = \frac{\rho}{1-\rho}\{1 - \xi'_\rho(1) - 2\xi_\rho(1)\}. \qquad (12)$$

Differentiating (7) twice and setting $s = 1$ gives

$$v_\rho = \frac{\rho}{(1-\rho)^2}\Big\{1 - \rho\big[\xi'^2_\rho(1) + 4\xi^2_\rho(1) + 4\xi_\rho(1)\,\xi'_\rho(1)\big] - (1-\rho)\big[\xi''_\rho(1) + 5\xi'_\rho(1) + 4\xi_\rho(1)\big]\Big\}. \qquad (13)$$
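
Relations (12) and (13) can be checked numerically on a case where everything is known in closed form. The sketch below assumes deterministic balking at $K$, for which $p(x) \propto \rho^x$ on $0 \le x \le K+1$ and $\xi(s) = s^K p(K+1)$; the reading of (12) and (13) used here is the one obtained by differentiating (7), and all function names are hypothetical.

```python
from fractions import Fraction


def moments_from_xi(rho, xi, dxi, d2xi):
    """Mean and variance from xi(1), xi'(1), xi''(1) via (12) and (13); rho != 1."""
    mean = rho / (1 - rho) * (1 - dxi - 2 * xi)
    var = rho / (1 - rho) ** 2 * (
        1
        - rho * (dxi**2 + 4 * xi**2 + 4 * xi * dxi)
        - (1 - rho) * (d2xi + 5 * dxi + 4 * xi)
    )
    return mean, var


def deterministic_check(rho, K):
    """Compare direct moments with (12), (13) when every customer balks at K."""
    weights = [rho**x for x in range(K + 2)]  # p(x) proportional to rho**x
    total = sum(weights)
    p = [w / total for w in weights]
    mean = sum(x * px for x, px in enumerate(p))
    var = sum(x * x * px for x, px in enumerate(p)) - mean**2
    xi = p[K + 1]                  # xi(1): probability that an arrival balks
    dxi = K * p[K + 1]             # xi'(1)
    d2xi = K * (K - 1) * p[K + 1]  # xi''(1)
    rho_eff = rho - rho * xi       # effective traffic intensity rho'
    return (mean, var), moments_from_xi(rho, xi, dxi, d2xi), p[0], rho_eff


direct, via_xi, p0, rho_eff = deterministic_check(Fraction(3), 2)
```

In this example $\rho = 3$ exceeds one, yet the effective intensity $\rho'$ stays below one and satisfies $p(0) = 1 - \rho'$.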


5. SOME EXAMPLES 
(i) Binomial queue 
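Condition (5) is satisfied by the finite distribution

$$p(x) = {}^{n}C_{x}\,\pi^x (1-\pi)^{n-x} \qquad (x = 0, 1, \dots, n),$$

where $0 < \pi < 1$. We find

$$\rho = \frac{n\pi}{1-\pi}, \quad 0 < \rho < \infty; \qquad m_\rho = \frac{n\rho}{n+\rho} \to n \quad\text{as}\quad \rho \to \infty.$$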


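The corresponding balking distribution is given by

$$G(x) = \begin{cases} \dfrac{n-x-1}{n(x+2)} & 0 \le x \le n-1, \\ 0 & n \le x, \end{cases} \qquad f(x) = \begin{cases} \dfrac{n+1}{n(x+1)(x+2)} & 0 \le x \le n-1, \\ 0 & n \le x, \end{cases}$$

and is independent of $\pi$. As $\rho \to \infty$, $\pi \to 1$ and the queue is almost always of length just $n$.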
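The relations (8) to (11) may be verified immediately:

$$p_\rho(0) = \left(\frac{n}{n+\rho}\right)^{n}, \qquad \pi_\rho(s) = \left(\frac{n+\rho s}{n+\rho}\right)^{n}, \qquad v_\rho = \rho\left(\frac{n}{n+\rho}\right)^{2}.$$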


(ii) Negative binomial queue 
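An example of an infinite queue-length distribution is

$$p(x) = {}^{N+x-1}C_{N-1}\,\chi^x (1+\chi)^{-N-x} \qquad (x = 0, 1, 2, \dots),$$

where $\chi > 0$ and $N > 1$. We have

$$\lim_{x\to\infty}\frac{p(x+1)}{p(x)} = \frac{\chi}{1+\chi}, \qquad G(\infty) = \frac{1}{N},$$

$$\rho = \frac{N\chi}{1+\chi}, \quad 0 < \rho < N; \qquad m_\rho = \frac{N\rho}{N-\rho} \to \infty \quad\text{as}\quad \rho \to N.$$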
The balking distribution is

$$G(x) = \frac{N+x+1}{N(x+2)}, \qquad f(x) = \frac{N-1}{N}\,\frac{1}{(x+1)(x+2)} \qquad (x = 0, 1, 2, \dots),$$

with a probability $1/N$ at $x = \infty$. The balking distribution is independent of $\chi$:


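$$p_\rho(0) = \left(\frac{N-\rho}{N}\right)^{N}, \qquad \pi_\rho(s) = \left(\frac{N-\rho}{N-\rho s}\right)^{N}, \qquad v_\rho = \rho\left(\frac{N}{N-\rho}\right)^{2}.$$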
(iii) Poisson queue

Each of the queue distributions in (i) and (ii) approaches the Poisson form as $n, N \to \infty$:

$$p(x) = \frac{\rho^x e^{-\rho}}{x!} \qquad (x = 0, 1, 2, \dots).$$

We now have $G(\infty) = 0$, so $0 < \rho < \infty$; $m_\rho = \rho = v_\rho$. The balking distribution becomes

$$G(x) = \frac{1}{x+2}, \qquad f(x) = \frac{1}{(x+1)(x+2)} \qquad (x = 0, 1, 2, \dots),$$

which may be regarded as a discrete analogue of the Cauchy distribution. Also

$$p_\rho(0) = e^{-\rho}, \qquad \pi_\rho(s) = e^{-\rho+\rho s}.$$
(iv) Type III ordinates 
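Another possible infinite queue-length distribution is

$$p(x) = A(x+a)^{\nu} e^{-\lambda x} \qquad (x = 0, 1, 2, \dots),$$

where $a, \nu, \lambda > 0$; $A$ is a complicated function of $a$, $\nu$ and $\lambda$. Here

$$G(\infty) = \left(\frac{a}{a+1}\right)^{\nu}, \qquad \rho = \left(\frac{a+1}{a}\right)^{\nu} e^{-\lambda}, \quad\text{so}\quad 0 < \rho < \left(\frac{a+1}{a}\right)^{\nu}.$$

The balking distribution is

$$G(x) = \left\{\frac{a(a+x+2)}{(a+1)(a+x+1)}\right\}^{\nu} \qquad (x = 0, 1, 2, \dots)$$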


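which is independent of $\lambda$. For $\rho$ near $(a+1)^{\nu}/a^{\nu}$, we have approximately

$$m_\rho + a \approx \frac{\nu+1}{\log\left\{\dfrac{1}{\rho}\left(\dfrac{a+1}{a}\right)^{\nu}\right\}}, \qquad v_\rho \approx \frac{(m_\rho+a)^2}{\nu+1}.$$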

If we let $\nu \to 0$ we obtain the classic case where every arrival must join the queue

$$p(x) = A e^{-\lambda x}, \qquad G(\infty) = 1, \qquad \rho = e^{-\lambda}, \qquad 0 < \rho < 1.$$


(v) Normal ordinates 


A very simple result is obtained when we assume that the queue-length distribution is

$$p(x) = A \exp\left\{-\frac{(x-m)^2}{2v}\right\} \qquad (x = 0, 1, 2, \dots),$$

where $m$ and $v$ are very nearly the mean and variance of the distribution if $v$ and $m^2/v$ are
‘large’, say both > 9. We find

$$G(\infty) = 0, \qquad \rho = \exp\{(m - \tfrac{1}{2})/v\}, \qquad 0 < \rho < \infty.$$

The balking distribution is Pascal (geometric)

$$G(x) = \exp\{-(x+1)/v\}, \qquad f(x) = (1-\Lambda)\Lambda^x \qquad (x = 0, 1, 2, \dots), \qquad (14)$$

where $\Lambda = \exp(-v^{-1})$, and is independent of $m$. Denoting the mean of this distribution by
$M = \Lambda/(1-\Lambda)$, we have

$$m_\rho \approx m = \tfrac{1}{2} + v\log\rho, \qquad v_\rho \approx v = \left[\log\left(\frac{M+1}{M}\right)\right]^{-1}. \qquad (15)$$

The relation $m_\rho = \text{const.} + v\log\rho$ cannot hold exactly for any distribution whatever,
as from (11) it implies first $v_\rho = c = \text{const.}$ and then $m_\rho < 0$ for $\rho < \exp\{-(c/v)\}$.


(vi) Deterministic balking 


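One case in which these calculations are easy is the following: each person possesses
exactly the same degree of queue resistance, i.e. the balking distribution is deterministic,
$f(x) = 1$ for $x = K$ and zero otherwise. Although rather trivial, this case will be important
in an application to be mentioned subsequently. We have

$$G(x) = \begin{cases} 1 & 0 \le x \le K-1, \\ 0 & K \le x, \end{cases} \qquad c_x = \begin{cases} 1 & 0 \le x \le K+1, \\ 0 & K+2 \le x, \end{cases}$$

so

$$p_\rho(0) = \frac{1-\rho}{1-\rho^{K+2}}, \qquad p_\rho(x) = \begin{cases} \dfrac{1-\rho}{1-\rho^{K+2}}\,\rho^x & 0 \le x \le K+1, \\ 0 & K+2 \le x. \end{cases}$$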


From (8) and (9), or direct from the definitions, we have

$$\xi_\rho(s) = \frac{1-\rho}{1-\rho^{K+2}}\,\rho^{K+1} s^{K}, \qquad \pi_\rho(s) = \frac{1-\rho}{1-\rho^{K+2}}\cdot\frac{1-(\rho s)^{K+2}}{1-\rho s}.$$


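Hence, from (12) and (13), or from (10) and (11), we have

$$m_\rho = \frac{\rho}{1-\rho} - \frac{(K+2)\rho^{K+2}}{1-\rho^{K+2}}, \qquad v_\rho = \frac{\rho}{(1-\rho)^2} - \frac{(K+2)^2\rho^{K+2}}{(1-\rho^{K+2})^2}.$$

In the trivial situation where each individual insists on immediate service, $K = 0$, and

$$m_\rho = \frac{\rho}{1+\rho}, \qquad v_\rho = \frac{\rho}{(1+\rho)^2}.$$

In the classic situation, in which each individual must join the queue, $K = \infty$, and

$$m_\rho = \frac{\rho}{1-\rho}, \qquad v_\rho = \frac{\rho}{(1-\rho)^2}.$$
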
Each of (i) to (v) above has led to a balking distribution which is unimodal with mode at
$x = 0$. It seems very difficult to obtain explicit results for a balking distribution with mode
not zero; numerical calculations were carried out for $\rho = 2, 3, 4, 5, 6, 7$, for each of the
following balking distributions:
(vii) Poisson with mean 10.
(viii) Ordinates of a lognormal distribution $f(y)$ where $5(\log y - 1)$ is a unit normal
variable.
(ix) Ordinates of a $\chi^2$ distribution with 10 degrees of freedom.
(x) The Pascal distribution (14) with $\Lambda = 10/11$.
Table 1 shows the means and variances of the queue-length distributions obtained,
together with those of the ‘input’ balking distributions. Also, in the last column, are given
the approximate values for the Pascal distribution (x) obtained from (15). It will be seen
that the approximation is very close except when $\rho = 2$; here $m^2/v$ is only 5.5.

Table 1

                    Poisson      Lognormal    χ²           Pascal       Formulae (15)
Input    Mean       10           10.57        9.498        10           10
         Variance   10           19.5384      19.23        110          110
ρ = 2    Mean       10.320206    10.212608    9.475843     7.823407     7.772564
         Variance   4.908937     6.506500     6.110196     9.726074     10.49205
ρ = 3    Mean       11.941052    12.578738    11.633753    12.026812    12.026641
         Variance   3.351061     5.382238     4.725268     10.467543    10.49205
ρ = 4    Mean       12.826379    14.078101    12.914084    15.041116    15.045024
         Variance   2.844219     5.074961     4.248282     10.490947    10.49205
ρ = 5    Mean       13.430370    15.192921    13.830718    17.363218    17.386325
         Variance   2.582158     4.929776     4.032884     10.463681    10.49205
ρ = 6    Mean       13.885518    16.086535    14.547050    19.284075    19.299236
         Variance   2.416736     4.858583     3.839912     10.533106    10.49205
ρ = 7    Mean       14.248864    16.831880    15.130198    20.919959    20.916585
         Variance   2.301017     4.813526     3.728680     10.485757    10.49205
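
The last column of Table 1 can be reproduced from (15) in a few lines. This is a minimal sketch assuming the reading $m_\rho \approx \tfrac12 + v\log\rho$ and $v = [\log\{(M+1)/M\}]^{-1}$ with natural logarithms; the function name is hypothetical.

```python
import math


def pascal_approximation(M, rho):
    """Approximate mean and variance of queue length from (15).

    M is the mean of the Pascal (geometric) balking distribution and rho the
    traffic intensity; natural logarithms are assumed.
    """
    v = 1.0 / math.log((M + 1.0) / M)
    m = 0.5 + v * math.log(rho)
    return m, v


# Pascal balking distribution with Lambda = 10/11, i.e. M = 10.
approx = {rho: pascal_approximation(10, rho) for rho in range(2, 8)}
```

The variance returned does not depend on $\rho$, which is why the same value recurs in every row of that column.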





6. SOME GENERALIZATIONS 


An interesting application of queueing with balking is furnished by the problem of a
sequence of transporting mechanisms which move discrete units of cargo. In the terminology
developed by the Department of Engineering of the University of California at Los
Angeles (1953), each transporting agency constitutes a ‘link’, and the place where two such
mechanisms transfer their loads is a ‘node’. If there is room at a node for the storage of $S$
items, the number in storage at time $t$ may be regarded as a queue of length $0, 1, \dots, S$. If
the link setting down items carries $A_1$ units at a time, and the link picking up items carries
$A_2$ at a time, then we must assume $S \geq A_1 + A_2 - 1$, to prevent the process stopping altogether.
The times required for the journeys of the links (in which we absorb the time required
for pick up or set down) constitute the inter-arrival and inter-departure times of the queue
at the node between them. If the storage $k(t)$ at the node is such that the arriving link
cannot set down its cargo, i.e. if $S - k(t) < A_1$, the link goes back and arrives again after
another inter-arrival time. If the storage when the removing link arrives satisfies $k(t) < A_2$, so
that a full load is not available, it also goes back and forth until the inequality is reversed.

Thus, considering only two links separated by one node, the following generalizations
of the probabilistic queueing model discussed in our introduction are suggested: (a) bulk
arrivals and departures, (b) balking distributions associated with both arrivals and departures,
(c) a finite number of states possible. Also, the cargo-handling case requires
deterministic balking; we shall begin with stochastic balking, and then specialize for
cargo-handling.

Let $K_1$ and $K_2$ be two integers such that when $A_1$ items arrive, they are added to the
queue when $k(t) \leq K_1$ and not otherwise, and when the removal mechanism arrives, it takes
away $A_2$ items if $k(t) \geq K_2$ and not otherwise. Let $F_i(x) = \Pr(K_i \leq x) = 1 - G_i(x)$.

Using the same argument that was employed in § 2,

$$\begin{aligned} p(x, t+\Delta t) = {} & (1 - \lambda\Delta t - \mu\Delta t)\,p(x,t) \\ & + \lambda\Delta t(1 - \mu\Delta t)\,[\,p(x,t)\,F_1(x-1) + p(x-A_1,t)\,G_1(x-A_1-1)\,] \\ & + \mu\Delta t(1 - \lambda\Delta t)\,[\,p(x,t)\,G_2(x) + p(x+A_2,t)\,F_2(x+A_2)\,] + o(\Delta t), \end{aligned}$$

where departures as well as arrivals are now partitioned into two cases. Also, we have
special cases not only for $x = 0$, but for $x$ less than $A_1$ or $A_2$. However, this can be dealt
with by conventions regarding negative arguments. We put

$$p(x) = 0 \ \text{for}\ x < 0 \ \text{or}\ x > S, \qquad F_1(x) = 0 \ \text{for}\ x < 0, \qquad F_2(x) = 0 \ \text{for}\ x < 0.$$

Passing to the limit, and letting $t \to \infty$ (assuming equilibrium) we obtain

$$0 = -\rho\,G_1(x-A_1-1)\,p(x-A_1) + \{\rho\,G_1(x-1) + F_2(x)\}\,p(x) - F_2(x+A_2)\,p(x+A_2) \qquad (x = 0, 1, \dots, S).$$
These $S+1$ equations are linearly dependent; there is the further relation $\sum p(x) = 1$. The
system of equations may be written $Bp = a$, where

$$p = \{p(0), p(1), \dots, p(S)\}, \qquad a = \{0, 0, \dots, 1\}$$

and $B$ is an $(S+1) \times (S+1)$ matrix having zeros everywhere except in the last row (where
all the elements are unity), the principal diagonal, the $A_2$th super-diagonal, and the $A_1$th
subdiagonal.


If $A_1 = A_2 = 1$, these equations can be solved exactly as in § 3; only a redefinition of
$c_x$ is needed:

$$c_x = \prod_{i=0}^{x-2} G_1(i) \Big/ \prod_{i=1}^{x} F_2(i).$$
In case of arbitrary $A_1$ and $A_2$, but with deterministic balking defined by

$$K_1 = S - A_1, \qquad K_2 = A_2$$

(which are the values suggested by the cargo-handling problem), the matrix $B$ simplifies
as follows:

(a) the elements in the $A_1$th subdiagonal become $-\rho$,

(b) the elements in the $A_2$th super-diagonal become $-1$,

(c) the principal diagonal consists of $A_2$ elements of value $\rho$, followed by
$S - A_1 - A_2 + 1 \geq 0$ elements of value $1 + \rho$, and, finally, $A_1$ elements of value 1.
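
The structure (a) to (c) can be turned into a small exact solver for the equilibrium probabilities. The sketch below is an illustration under assumptions, not the author's computation: it takes the subdiagonal offset to be $A_1$ and the super-diagonal offset to be $A_2$, replaces the last equation by the normalisation $\sum p(x) = 1$, and uses hypothetical function names together with a plain Gauss-Jordan elimination in exact arithmetic.

```python
from fractions import Fraction


def cargo_matrix(rho, A1, A2, S):
    """Matrix B for deterministic balking with K1 = S - A1 and K2 = A2."""
    rho = Fraction(rho)
    n = S + 1
    B = [[Fraction(0)] * n for _ in range(n)]
    for x in range(n):
        joins = 1 if x <= S - A1 else 0   # an arriving load can be set down
        leaves = 1 if x >= A2 else 0      # a full load can be picked up
        B[x][x] = rho * joins + leaves
        if x - A1 >= 0:
            B[x][x - A1] = -rho           # A1-th subdiagonal
        if x + A2 <= S:
            B[x][x + A2] = Fraction(-1)   # A2-th super-diagonal
    B[S] = [Fraction(1)] * n              # last row: normalisation
    return B


def solve(B, a):
    """Solve B p = a by Gauss-Jordan elimination with exact fractions."""
    n = len(a)
    M = [row[:] + [Fraction(val)] for row, val in zip(B, a)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("singular matrix")
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]


def equilibrium(rho, A1, A2, S):
    return solve(cargo_matrix(rho, A1, A2, S), [0] * S + [1])


p_unit = equilibrium(2, 1, 1, 3)   # A1 = A2 = 1: truncated geometric
p_bulk = equilibrium(1, 1, 2, 3)   # unit arrivals, departures in pairs
```

A `ValueError` from `solve` signals a vanishing determinant, which is the situation addressed by the conjectures that follow.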
I have evaluated this determinant only in special cases, but offer the following conjectures
about the polynomial $B(\rho; A_1, A_2, S)$:
(i) it vanishes unless $A_1$ and $A_2$ are relatively prime;
(ii) if $A_1$ and $A_2$ are relatively prime, it consists of $S - A_1 - A_2 + 3$ terms, beginning with
a term of degree $S - A_1 + 1$ and ending with a term of degree $A_2 - 1$;
(iii) $B(\rho; m, n, S) = \rho^{S} B(\rho^{-1}; n, m, S)$;
(iv) the first and last coefficients are $A_1$ and $A_2$, respectively;
(v) $B(\rho; 1, 1, S) = \sum_{j=0}^{S} \rho^{j}$.


I wish to thank Drs F. N. David, D. G. Kendall, C. L. Mallows and R. Bellman for their
helpful suggestions.
TESTING FOR SERIAL CORRELATION IN SYSTEMS OF
SIMULTANEOUS REGRESSION EQUATIONS 


By J. DURBIN 
Research Techniques Unit, London School of Economics and Political Science 


1. INTRODUCTION 


Much attention has been given in recent years to the problem of fitting economic models and 
this has led to the study of systems of simultaneous regression equations of the form 


$$Ay = Bx + \epsilon, \qquad (1)$$

where $y$ is a vector of jointly dependent variables, $x$ is a vector of predetermined or independent
variables, $\epsilon$ is a vector of errors, and $A$ and $B$ are matrices of unknown parameters.
It is usual to assume that $A$ is square and non-singular, that the $\epsilon$’s are random variables with
zero means and constant variance matrix, and that the $x$’s are either constants or random
variables distributed independently of the $\epsilon$’s. Given the model (1) we usually require to
estimate the elements of $A$ and $B$ from a sample of $n$ observations of $y$ and $x$ and to make
confidence statements about the estimates.

In some formulations certain of the x’s coincide with lagged values of the y’s. The theory 
becomes much more complicated in such cases, and we shall not consider them except to 
point out that the results obtained later in the paper, which are exact for the model specified 
above, may be expected to hold approximately for models containing lagged dependent 
variables. 

The specification (1) is a natural generalization of the ordinary linear regression model and 
calls for similar methods of analysis. However, a number of new difficulties present them- 
selves, among which is the identification problem, that is the question whether the parameter 
values can be inferred from a complete knowledge of the distribution of the observations. It 
turns out that this is possible under reasonable assumptions only when certain restrictions 
are obeyed by the elements of A and B. Roughly speaking, when the restrictions are just 
sufficient to enable the parameters of a particular equation to be estimated the equation is 
said to be just-identified, and when the restrictions are more than sufficient the equation is 
said to be over-identified. More detailed explanations are given by Girshick & Haavelmo 
(1947) and Koopmans and others (1950). 

When an equation is identified its parameters can be consistently estimated by the 
reduced-form method or the limited-information method, a treatment of which has been 
given by Anderson & Rubin (1949, 1950). Although some writers take the phrases as 
equivalent, in the present paper we shall use the phrase reduced-form method to refer only to 
estimation in the just-identified case while limited-information method will be taken to 
refer only to estimation in the over-identified case. These methods have the advantage that 
the parameters of the particular equation are estimated independently of the remaining 
parameters of the system. This is convenient since in many investigations we are not 
interested in the estimation of all the parameters of the system but only of those in a specific 
equation. The methods involve a possible loss of efficiency since the restrictions on the 
elements of A and B other than those in the equation under study are ignored in the 
estimation procedure. With this reservation, when the errors are normally distributed and
satisfy reasonable assumptions, the estimators so obtained are maximum likelihood 
estimators. 

A method for the simultaneous estimation of all the parameters of the system which 
makes use of all the information available about the restrictions on the elements of A and 
B has been developed by Koopmans and others (1950), but it requires extremely heavy 
computations and has other disadvantages which make it unlikely that the method will be 
found useful in practice. Consequently, it will be assumed in this paper that the equations 
under study have been fitted by reduced-form or limited-information methods. 

These methods are extensions of the method of least squares to the simultaneous equations 
case, and like least squares depend for their validity on the assumption that the errors are 
serially uncorrelated. If this assumption does not hold, the estimators are not efficient and 
confidence regions calculated without taking the serial correlation into account may be 
highly misleading. Tests of serial correlation should therefore form part of any analysis in 
which the observations consist of time series. A small-sample test for single-equation 
models has been given by Durbin & Watson (1950, 1951). The purpose of the present paper 
is to extend this test to cover simultaneous-equation models in which the parameters have 
been fitted by the methods referred to. 

It is, of course, possible to transform (1) by multiplying through by $A^{-1}$ to give the system

$$y = A^{-1}Bx + A^{-1}\epsilon, \qquad (2)$$

in which each equation contains only a single dependent variable. If the original errors $\epsilon$ are
serially uncorrelated, so are the transformed errors $A^{-1}\epsilon$ and each equation may be fitted
separately by least squares. Consequently, the test appropriate to single-equation models
may be applied to each equation separately and in this way it appears that the hypothesis
that all the $\epsilon$’s are serially uncorrelated may be investigated. In view of this, it may be asked
why any special consideration need be given to simultaneous-equation models since on the
face of it the problem can be dealt with by existing methods.

In the first place there is no method available for combining the results from the separate 
tests into an overall test for all the errors in the model. Secondly, even if an overall test could 
be constructed it would not be suitable for cases in which our attention is focused on a specific 
equation of the model. What is required is a test which has high power against alternatives
affecting estimators of the parameters in that equation. As will be seen below, for a sample 
of n observations these estimators depend on a vector of n variates which are functions of 
the errors in the equations of (1). The estimation procedure depends on the partitioning of 
this vector into a component lying in a space of r dimensions, called the estimation space, 
r being the number of independent variables in the model, and a component in a space of 
n—r dimensions, called the residual space. On the usual assumptions of normality of the 
errors, the efficiency of the estimates and the validity of confidence statements about them 
depends on the property that these components are randomly directed in their respective 
spaces. The effect of departure from serial independence is to make the distribution of direc- 
tions non-random. Now the estimation component depends on the unknown parameters 
that are being estimated whereas the direction of the residual component is independent of 
unknown parameters. Thus, it seems reasonable to base the test of serial dependence on the 
distribution of directions of the residual component. The statistic used in the tests described 
below measures the direction of this component and is known to have high power against 
Markoff alternatives. Details of its power properties may be found in the paper by Durbin
& Watson (1950). 

Reference to a procedure attributed to H. Rubin for testing serial correlation in simul- 
taneous-equation models is made on pp. 70 and 73 of the book by Klein (1950). Details are 
not given, but the method is based on the lag correlations of the complete set of residuals 
from all equations of the system, and so is quite different from the test described in this 
paper. Presumably Rubin’s is a large-sample test, whereas the test described below is valid 
for small samples. 

2. JUST-IDENTIFIED EQUATIONS 


Suppose that the model (1) has $p+1$ equations and that we wish to estimate the parameters
in the first of these equations, which we assume is just-identified. Suppose also that the
vector $x$ has $k+p$ elements ($k > 0$). It is convenient to write (1) in the transposed form

$$y'A' = x'B' + \epsilon'. \qquad (3)$$

The model for $n$ observations is then

$$YA' = XB' + E, \qquad (4)$$

where $Y$, $X$ are $n \times (p+1)$ and $n \times (k+p)$ matrices of observations and $E$ is an $n \times (p+1)$
matrix of errors. The rows of $E$ are assumed to be independent vectors from the same
multivariate normal population with zero variate means. We consider the usual case in
which the restrictions on the elements of $A = [\alpha_{ij}]$ and $B = [\beta_{ij}]$ which ensure identification
are $\alpha_{11} = 1$ and

$$\beta_{1,k+1} = \beta_{1,k+2} = \dots = \beta_{1,k+p} = 0.$$

Representing column vectors by small letters and matrices by capital letters, (4) can then
be written in the form

$$[y_1 : Y_2]\begin{bmatrix} 1 & a_1' \\ \alpha & A_2' \end{bmatrix} = [X_1 : X_2^*]\begin{bmatrix} \beta & B_1^* \\ 0_p & B_2 \end{bmatrix} + [e_1 : E_2], \qquad (5)$$

where $\alpha$ and $\beta$ are $p \times 1$ and $k \times 1$ vectors whose elements are to be estimated and $0_p$ is
a $p \times 1$ vector of zeros. The values of the elements of the vector $a_1$ and the matrices $A_2$, $B_1^*$
and $B_2$ do not concern us except that $B_2$ is required to have rank $p$. $e_1$ is the $n \times 1$ vector of
errors in the first equation and $E_2$ is the $n \times p$ matrix of errors in the remaining equations of
the system. In this notation the $n$ observations of the first equation are written as

$$y_1 + Y_2\alpha = X_1\beta + e_1. \qquad (6)$$


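Let $X_2$ denote the part of $X_2^*$ orthogonal to $X_1$, i.e. $X_2 = X_2^* - X_1(X_1'X_1)^{-1}X_1'X_2^*$. (5) can
then be rewritten as

$$[y_1 : Y_2]\begin{bmatrix} 1 & a_1' \\ \alpha & A_2' \end{bmatrix} = [X_1 : X_2]\begin{bmatrix} \beta & B_1 \\ 0_p & B_2 \end{bmatrix} + [e_1 : E_2]. \qquad (7)$$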


For the purpose of developing the test of serial correlation we may without loss of generality
assume that the columns of $X_1$ and $X_2$ are normalized and orthogonal, i.e. $X_1'X_1$ and $X_2'X_2$
are unit matrices. This assumption is permissible since the test will be based on the residuals
from the fitted equation and these are unaffected by the assumption.

The reduced-form estimator $a$ of $\alpha$ is obtained by equating to zero the regression of $y_1 + Y_2a$
on $X_2$, i.e. by setting $X_2'(y_1 + Y_2a) = 0_p$, which gives $a = -(X_2'Y_2)^{-1}X_2'y_1$. The estimator of $\beta$
is given by the regression of $y_1 + Y_2a$ on $X_1$, i.e. $b = X_1'(y_1 + Y_2a)$. The set of $n$ residuals from
the fitted equation (6) is then given by $z = y_1 + Y_2a - X_1b$.
As in the single-equation case (Durbin & Watson, 1950, 1951) the test for serial correlation
is performed by calculating the statistic

$$d = \frac{\sum_{i=2}^{n}(z_i - z_{i-1})^2}{\sum_{i=1}^{n} z_i^2} = \frac{z'Dz}{z'z}, \qquad (8)$$

where $D$ is the matrix

$$D = \begin{bmatrix} 1 & -1 & 0 & \cdots & 0 \\ -1 & 2 & -1 & \cdots & 0 \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ 0 & \cdots & -1 & 2 & -1 \\ 0 & \cdots & 0 & -1 & 1 \end{bmatrix}.$$

We shall show that in the usual case in which one of the columns of $X_1$ is the vector of 1’s,
$(1, 1, \dots, 1)'$, the significance of $d$ may be tested by entering Durbin & Watson’s (1951) tables with
$k' = k + p - 1$.
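Post-multiplying (4) by $(A')^{-1}$ we have

$$Y = XC + F,$$

or, in the form corresponding to (7),

$$[y_1 : Y_2] = [X_1 : X_2]\,C + [f_1 : F_2]. \qquad (9)$$

Substituting in $z = y_1 + Y_2a - X_1b$, we have

$$z = X_1c_1 + X_2c_2 + f,$$

where $c_1$ and $c_2$ are given vectors depending on $a$, $b$ and $f = f_1 + F_2a$.
Let $K$ be an $(n-k-p) \times n$ matrix such that $H = \begin{bmatrix} X_1' \\ X_2' \\ K \end{bmatrix}$ is an orthogonal matrix, and let
$\xi = Hz$. Then

$$\xi = \begin{bmatrix} X_1' \\ X_2' \\ K \end{bmatrix}[X_1c_1 + X_2c_2 + f] = \begin{bmatrix} 0_{k+p} \\ u \end{bmatrix},$$

where $u$ is the $(n-k-p)$ vector $Kf$.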

Let a set of independent normal variates (vectors) with zero means and constant variance
(variance matrix) be called INZC variates (vectors). Since the rows of $E$ are INZC vectors,
so are the rows of $F$, and since $H$ is orthogonal so are the rows of $HF$.

Consider the distribution of $u$ conditional on the first $k+p$ rows of $HF$ being held fixed.
The estimating equation for $a$ is $X_2'(y_1 + Y_2a) = 0_p$, which reduces to $c_2 + X_2'(f_1 + F_2a) = 0_p$.
Thus the value of $a$ depends only on the first $k+p$ rows of $HF$, and is therefore conditionally
fixed. Consequently $u$ behaves conditionally as if $a$ were a fixed quantity. Now $u$ depends
only on $a$ and the last $n-k-p$ rows of $HF$. Since the rows of $HF$ are INZC vectors of whose
elements $u$ is a linear combination, it follows that $u$ is conditionally a vector of $n-k-p$
INZC variates.

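Since $\xi = Hz$, we have $z = H'\xi$. Substituting in (8) we have

$$d = \frac{\xi'HDH'\xi}{\xi'\xi} = \frac{u'Lu}{u'u},$$

where $L$ is an $(n-k-p) \times (n-k-p)$ matrix.

Let $N$ be the orthogonal matrix diagonalizing $L$, i.e.,

$$N'LN = \begin{bmatrix} \nu_1 & & & \\ & \nu_2 & & \\ & & \ddots & \\ & & & \nu_{n-k-p} \end{bmatrix},$$

the blank spaces representing zeros. Let $\eta = N'u$, i.e. $u = N\eta$. Then

$$d = \frac{\sum_{i=1}^{n-k-p} \nu_i\eta_i^2}{\sum_{i=1}^{n-k-p} \eta_i^2}, \qquad (10)$$

where $\eta_1, \dots, \eta_{n-k-p}$ are conditionally INZC variates. Denote their common (conditional)
variance by $\sigma^2$ and let $\zeta_i = \eta_i/\sigma$ $(i = 1, \dots, n-k-p)$. Then $\zeta_1, \dots, \zeta_{n-k-p}$ are independent
$N(0, 1)$ variates. Substituting in (10) we have

$$d = \frac{\sum_{i=1}^{n-k-p} \nu_i\zeta_i^2}{\sum_{i=1}^{n-k-p} \zeta_i^2}. \qquad (11)$$

Now the INZC property of $\eta_1, \dots, \eta_{n-k-p}$ depended on the restrictions imposed on the
first $k+p$ rows of $HF$, whereas $\zeta_1, \dots, \zeta_{n-k-p}$ are independent $N(0, 1)$ independently of these
restrictions. We may therefore relax the restrictions and the distribution of $d$ defined by (11)
will hold unconditionally.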

$d$ is now in the form considered for the single-equation case by Durbin & Watson (1950,
especially pp. 411-13, and 1951). It is shown in these papers that $d_L \leq d \leq d_U$, where

$$d_L = \frac{\sum_{i=1}^{n-k-p} \lambda_{i+1}\zeta_i^2}{\sum_{i=1}^{n-k-p} \zeta_i^2} \quad\text{and}\quad d_U = \frac{\sum_{i=1}^{n-k-p} \lambda_{k+p+i}\zeta_i^2}{\sum_{i=1}^{n-k-p} \zeta_i^2},$$

$\lambda_1 < \lambda_2 < \dots < \lambda_n$ being the latent roots of $D$. Significance points of $d_L$ and $d_U$ were tabulated
by Durbin & Watson and in their notation $k+p-1 = k'$. It may be noted that in each case
$k'$ is the number of constants fitted in addition to the mean. For a test against positive serial
correlation an observed $d$ less than the tabulated value of $d_L$ is declared significant, while an
observed value greater than the tabulated $d_U$ is declared non-significant. For intermediate
values of $d$ the test is inconclusive; however, an approximate procedure to deal with this
contingency was given by Durbin & Watson and this applies equally to the present case. The
method requires the calculation of the first two moments of $d$ in terms of the matrix $X - \bar{X}$ of
deviations from the means of the $k+p-1$ independent variables (other than that corresponding
to the mean) in the model. For tests against negative serial correlation and two-sided
tests the procedures given by Durbin & Watson may be followed.
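
The latent roots of $D$ that enter the bounds have a simple closed form, which can be checked directly. The sketch below relies on a standard property of this tridiagonal matrix that is not quoted in the text above, namely that its roots are $2\{1 - \cos(\pi j/n)\}$ for $j = 0, 1, \dots, n-1$ with latent vectors whose $i$th element is $\cos\{\pi j(i - \tfrac12)/n\}$; the function names are hypothetical.

```python
import math


def difference_matrix(n):
    """The n x n matrix D of the quadratic form sum (z_i - z_{i-1})^2, n >= 2."""
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        D[i][i] = 2.0
        if i > 0:
            D[i][i - 1] = -1.0
        if i < n - 1:
            D[i][i + 1] = -1.0
    D[0][0] = D[n - 1][n - 1] = 1.0
    return D


def latent_roots(n):
    """Roots in increasing order; the first is zero (the constant vector)."""
    return [2.0 * (1.0 - math.cos(math.pi * j / n)) for j in range(n)]


def latent_vector(n, j):
    """Unnormalised latent vector belonging to the j-th root (j = 0..n-1)."""
    return [math.cos(math.pi * j * (i + 0.5) / n) for i in range(n)]


def max_residual(n):
    """Largest absolute entry of D v - lambda v over all n root/vector pairs."""
    D, worst = difference_matrix(n), 0.0
    for j, lam in enumerate(latent_roots(n)):
        v = latent_vector(n, j)
        for i in range(n):
            Dv_i = sum(D[i][c] * v[c] for c in range(n))
            worst = max(worst, abs(Dv_i - lam * v[i]))
    return worst


roots = latent_roots(6)
residual = max_residual(6)
```

The zero root corresponds to the vector of 1’s, which is why fitting the mean removes it from the sums defining the bounds.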

The results of this section may be summarized in the following rule: 

When a just-identified equation has been fitted by the reduced-form method a test of
serial correlation may be performed by calculating

$$d = \frac{\sum_{i=2}^{n}(z_i - z_{i-1})^2}{\sum_{i=1}^{n} z_i^2},$$

where $z_1, \dots, z_n$ is the set of observed residuals, and entering the tables of $d_L$ and $d_U$ (Durbin &
Watson, 1951) with $k'$ equal to the number of constants fitted in addition to the mean.
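
The statistic in this rule is straightforward to compute once the residuals are available. The following is a minimal sketch with hypothetical function names; it computes $d$ and the table entry $k'$ only, and leaves the comparison with the tabulated $d_L$ and $d_U$ to the reader, since those tables are not reproduced here.

```python
def durbin_watson(z):
    """d = sum_{i=2}^{n} (z_i - z_{i-1})^2 / sum_{i=1}^{n} z_i^2."""
    if len(z) < 2:
        raise ValueError("need at least two residuals")
    denominator = sum(v * v for v in z)
    if denominator == 0:
        raise ValueError("residuals are all zero")
    numerator = sum((z[i] - z[i - 1]) ** 2 for i in range(1, len(z)))
    return numerator / denominator


def table_entry(k, p):
    """k' for the just-identified case: constants fitted besides the mean."""
    return k + p - 1


d_alternating = durbin_watson([1.0, -1.0, 1.0, -1.0])
d_constant = durbin_watson([1.0, 1.0, 1.0, 1.0])
```

Values of $d$ near 0 point towards positive serial correlation and values near 4 towards negative serial correlation, with 2 as the neutral value.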


3. OVER-IDENTIFIED EQUATIONS 


When the number of independent variables with zero coefficients in the first equation is
greater than $p$ the equation is over-identified. The model for $n$ observations is, as before,

$$YA' = XB' + E,$$

where $Y$, $E$ are $n \times (p+1)$ matrices but $X$ is now $n \times (k+q)$ with $q > p$. The restrictions on
$B$ which give rise to over-identification are $\beta_{1,k+1} = \beta_{1,k+2} = \dots = \beta_{1,k+q} = 0$.
Equation (4) can again be written in the form

$$[y_1 : Y_2]\begin{bmatrix} 1 & a_1' \\ \alpha & A_2' \end{bmatrix} = [X_1 : X_2]\begin{bmatrix} \beta & B_1 \\ 0_q & B_2 \end{bmatrix} + [e_1 : E_2], \qquad (12)$$

where $X = [X_1 : X_2^*]$ and $X_2 = X_2^* - X_1(X_1'X_1)^{-1}X_1'X_2^*$. If the columns of $X_1$ and $X_2$ are not
already normalized and orthogonal we may transform to new independent variables with
this property without affecting the values of the residuals on which the serial correlation
test is to be based.

The limited-information estimator of $\alpha$ is the vector $a$ such that in the Euclidean representation
of the $[n-k]$-dimensional space orthogonal to the $[k]$ space spanned by the
columns of $X_1$, the angle between the projection on the $[n-k]$ space of the vector $y_1 + Y_2a$
and the $[q]$ space spanned by the columns of $X_2$ is a maximum. It may be noted that in
the just-identified case, for which $q = p$, this angle is a right angle; when $q > p$, the angle will
in general be less than a right angle. The limited-information estimator $b$ of $\beta$ is given by the
regression of $y_1 + Y_2a$ on $X_1$, i.e. $b = X_1'(y_1 + Y_2a)$.

Let $\xi = H(y_1 + Y_2a - X_1b)$, where $K$ is such that $H = \begin{bmatrix} X_1' \\ X_2' \\ K \end{bmatrix}$ is an orthogonal matrix.
Then $\xi = \begin{bmatrix} 0_k \\ v \\ u \end{bmatrix}$, where $v = X_2'(y_1 + Y_2a)$ is the projection of the residual vector $z$ on the $X_2$
space and $u = K(f_1 + F_2a)$ where, as in § 2, the rows of $F = [f_1, F_2] = E(A')^{-1}$ are INZC vectors.
We choose $a$ to minimize $(v'v/u'u)$. As a function of the errors $v$ depends only on the elements
of $X_2'F$ and $u'u$ depends only on the elements of the $(p+1) \times (p+1)$ matrix $F'K'KF$.
Thus, $a$ depends only on the elements of $X_2'F$ and $F'K'KF$.
Now the rows of $KF$ are INZC vectors independent of the other rows of $HF$. Hence, their
joint density depends only on the elements of the matrix of sums of squares and products,
namely $F'K'KF$. Consequently, all samples for which $F'K'KF$ is held constant are equally
likely. Now the elements of $F'K'KF$ fix the lengths and relative inclinations of the vectors
constituting the columns of $KF$. Regarding these vectors as a set of axes, if $F'K'KF$ is fixed
the configuration of the axes is fixed, but all orientations of the axes in the $n-k-q$ space are
equally likely. Consequently, any vector whose orientation is fixed relative to the axes is
randomly directed in the $n-k-q$ space.

Consider now the distribution of $u$ conditional on $X_2'F$ and $F'K'KF$ being held fixed. Then
$a$ is a fixed vector, so that $u = K(f_1 + F_2a)$ is conditionally a fixed linear combination of the
columns of $KF$. It follows that $u$ is randomly directed in the $n-k-q$ space, i.e. the joint
distribution of the elements of $u$ is spherically symmetric.

Let $b_2$ denote the vector of least-squares coefficients of the regression of $y_1 + Y_2a$ on $X_2$,
i.e. $b_2 = X_2'(y_1 + Y_2a)$, and let $z^* = y_1 + Y_2a - X_1b - X_2b_2$. Then $z^*$ denotes the set of residuals
from the regression of $y_1 + Y_2a$ on $X_1$ and $X_2$. The test for serial correlation is performed by
calculating the statistic

$$d = \frac{\sum_{i=2}^{n}(z_i^* - z_{i-1}^*)^2}{\sum_{i=1}^{n} z_i^{*2}}.$$

Let $\xi^* = Hz^*$. Then $\xi^* = \begin{bmatrix} 0_{k+q} \\ u \end{bmatrix}$. As in (10),

$$d = \frac{\sum_{i=1}^{n-k-q} \nu_i\eta_i^2}{\sum_{i=1}^{n-k-q} \eta_i^2},$$

where $\eta = N'u$, $N'$ being an orthogonal matrix.
Consider the conditional distribution of $d$ for fixed $X_2'F$ and $F'K'KF$. Since $N'$ is orthogonal
and points $u$ are equi-probable on the sphere $u'u = \text{constant}$, points $\eta$ are equi-probable
on the sphere $\eta'\eta = \text{constant}$. Put $\eta_i^* = \eta_i/(\eta'\eta)^{1/2}$. Then $d = \sum_{i=1}^{n-k-q} \nu_i\eta_i^{*2}$, where points $\eta^*$ are
equi-probable on the unit sphere. Since $\eta_1^*, \dots, \eta_{n-k-q}^*$ are independent of $X_2'F$ and $F'K'KF$,
the unconditional distribution of $d$ when $X_2'F$ and $F'K'KF$ are allowed to vary is identical
with the conditional distribution when these quantities are held fixed.

Let $w$ be a $\chi^2$ variate with $n-k-q$ degrees of freedom independent of $\eta_1^*, \dots, \eta_{n-k-q}^*$, and
let $\zeta_i = w^{1/2}\eta_i^*$ $(i = 1, \dots, n-k-q)$. Then $\zeta_1, \dots, \zeta_{n-k-q}$ are independent $N(0, 1)$ variates and

$$d = \frac{\sum_{i=1}^{n-k-q} \nu_i\zeta_i^2}{\sum_{i=1}^{n-k-q} \zeta_i^2}.$$

This is in the same form as the expression (11) obtained for the just-identified case. It
follows that an observed $d$ may be tested by entering Durbin & Watson’s (1951) tables with
$k' = k+q-1$. When the test is inconclusive the same approximate procedure may be used
as in the single-equation case, based on the $n \times (k+q-1)$ matrix $X - \bar{X}$ of deviations from the
sample means of the $k+q-1$ independent variables other than the mean.

We have assumed in developing the theory of the test that the columns of $X_1$, $X_2$ are normalized
and orthogonal, but it is not necessary to transform the variables to satisfy this
condition in order to apply the test in practice. In fact the extra calculations required for the
test, additional to those necessary for calculating the limited-information estimates, are not
as onerous as the treatment given above might suggest.

The usual computing procedure for the limited-information method (Klein, 1953, § 4.3,
p. 169) requires the calculation of the matrix

$$W = Y'Y - Y'X(X'X)^{-1}X'Y,$$

where $Y$, $X$ represent matrices of deviations from means of observations of dependent
variables in the equation and independent variables in the model (whether in the equation
or not) respectively. Normally the matrix product $Y'X(X'X)^{-1}X'Y$ is calculated by means
of a single Doolittle procedure as described by Klein. When a serial-correlation test is
envisaged it will be more convenient to calculate $(X'X)^{-1}X'Y$ first and obtain the product
by multiplication by $Y'X$. Apart from this the vectors $a$, $b$ of limited-information estimates
are calculated in the way described by Klein.


Let $a^*$ denote the $(p+1) \times 1$ vector $\begin{pmatrix} 1 \\ a \end{pmatrix}$. Then the set of
residuals $z^*$ on which the test is based is the set of residuals from the least-squares
regression of $Ya^*$ on X, i.e.

$$z^* = [I - X(X'X)^{-1}X']\,Ya^*.$$


It is worth noting that the sum of squares $\sum_{i=1}^{n} z_i^{*2}$ is equal to $a^{*\prime}Wa^*$.
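
The identity between the residual sum of squares and the quadratic form in W can be checked numerically. What follows is only a minimal sketch in Python, assuming for simplicity a single independent variable (so that $(X'X)^{-1}$ reduces to a scalar) and two dependent variables. The data and the value a = -0.7 are invented for illustration and are not taken from the paper.

```python
def centre(values):
    """Deviations of the observations from their sample mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


# Invented data, expressed as deviations from means.
x = centre([1.0, 2.0, 4.0, 7.0, 11.0])          # the single column of X
y_cols = [centre([2.0, 1.0, 5.0, 6.0, 9.0]),    # the two columns of Y
          centre([0.5, 1.5, 1.0, 3.0, 2.0])]
a_star = [1.0, -0.7]                             # (1, a')' with a hypothetical a

# z* = [I - X(X'X)^{-1}X'] Y a*: residuals of Ya* regressed on x.
ya = [sum(c * col[i] for c, col in zip(a_star, y_cols)) for i in range(len(x))]
slope = dot(ya, x) / dot(x, x)
z_star = [v - slope * xi for v, xi in zip(ya, x)]
ss_direct = dot(z_star, z_star)

# W = Y'Y - Y'X(X'X)^{-1}X'Y, here a 2 x 2 matrix.
w = [[dot(yi, yj) - dot(yi, x) * dot(x, yj) / dot(x, x) for yj in y_cols]
     for yi in y_cols]
ss_from_w = sum(a_star[i] * w[i][j] * a_star[j]
                for i in range(2) for j in range(2))

print(ss_direct, ss_from_w)  # the two values agree up to rounding error
```

With several independent variables the scalar division would be replaced by a matrix inverse, but the algebra is the same.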


We may summarize the results of this section in the following rule: When the equation
$y_1 + Y_1\alpha = X_1\beta + \epsilon$ has been fitted by the limited-information method, a test of serial
correlation may be performed by calculating

$$d = \frac{\sum_{i=2}^{n} (z_i^* - z_{i-1}^*)^2}{\sum_{i=1}^{n} z_i^{*2}},$$

where $z_1^*, \ldots, z_n^*$ is the set of residuals from the multiple regression of $y_1 + Y_1a$ (a being the
limited-information estimate of $\alpha$) on all the independent variables of the system, whether or
not they occur in the fitted equation with non-zero coefficients, and referring to Durbin &
Watson's (1951) Table 1, with k' = k+q-1.
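
The statistic d in the rule above is simple to compute once the residuals are available. The following Python sketch is an illustration only: the function name `durbin_watson_d` and the two residual series are invented, and the comparison with the tabulated bounds is left to the reader.

```python
def durbin_watson_d(residuals):
    """Ratio of the sum of squared successive differences to the sum of squares."""
    if len(residuals) < 2:
        raise ValueError("at least two residuals are needed")
    numerator = sum((later - earlier) ** 2
                    for earlier, later in zip(residuals, residuals[1:]))
    denominator = sum(z * z for z in residuals)
    if denominator == 0:
        raise ValueError("residuals are all zero")
    return numerator / denominator


# Invented series: one alternating in sign, one drifting steadily upwards.
d_alternating = durbin_watson_d([1.0, -1.0, 1.0, -1.0, 1.0, -1.0])
d_trending = durbin_watson_d([-3.0, -2.0, -1.0, 1.0, 2.0, 3.0])

print(d_alternating, d_trending)
```

Values of d well below 2 point towards positive serial correlation and values well above 2 towards negative serial correlation, which is the behaviour the two invented series display.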

In practice it is unlikely that there will be much difference between the value of d calculated
from the residuals z and that calculated from the residuals z*. In fact the author conjectures
that the bounds test applies to the former value as well as to the latter, although he has been
unable to prove this. Thus, if it is wished to avoid the extra labour of calculating the residuals
z*, it seems that a good approximate test could be obtained by calculating d from the residuals
z and referring to Durbin & Watson's table with k' = k+q-1.
HETEROGENEOUS ERROR VARIANCES IN
SPLIT-PLOT EXPERIMENTS
By R. N. CURNOW
Agricultural Research Council Unit of Statistics, University of Aberdeen
SUMMARY. In a split-plot experiment, the common assumption is that the same error variance
applies to all subplot treatments. This paper is concerned with tests of significance for the departure
from equality of the variances for different subplot treatments, and with the estimation of the ratio
of a pair of such variances. The methods used may be regarded as extensions of those described by
Morgan (1939) and Pitman (1939) in connexion with the problem of comparing the variances of two
correlated and normally distributed measurements. In particular, Pitman's method is extended to
provide confidence limits for the variance ratio.

Suppose that visual acuities were measured, for right eye and left eye separately, on a sample of
men, and that the possibility of a difference between $\sigma_1^2$ and $\sigma_2^2$, the variances for the two eyes, was to
be investigated. If the distribution of the original measurements is bivariate normal, then so is that of
their sum and their difference. Morgan and Pitman noted that the correlation coefficient in the latter
distribution depends upon the difference $(\sigma_1^2 - \sigma_2^2)$, being zero if and only if $\sigma_1^2 = \sigma_2^2$. Consequently,
they proposed to test the hypothesis of equal variance by the standard significance test for a sample
correlation coefficient. In an agricultural experiment of split-plot design the variance for a subplot
may depend upon the treatment it receives. In particular this is liable to happen if one subplot
treatment increases the yields by a considerable amount. The simple procedure for testing the equality
of two variances outlined above would allow for the correlation between the subplots of each whole-plot,
but is no longer applicable, because of treatment and block classifications. The more general
methods of this paper enable tests of significance and confidence limits to be calculated.
An example of a simple split-plot experiment is given, with a discussion of the standard errors for
the various treatment comparisons.
1. INTRODUCTION
The simplest type of split-plot experiment consists of c blocks of a whole-plots, with each
whole-plot split into b subplots. Denote the a whole-plot treatments by $A_1, \ldots, A_a$ and
the b subplot treatments by $B_1, \ldots, B_b$. The usual model on which the analysis is based takes

$$y_{ijk} = \mu + b_k + \alpha_i + \beta_j + \gamma_{ij} + \epsilon_{ijk} \qquad (i = 1, \ldots, a;\; j = 1, \ldots, b;\; k = 1, \ldots, c),$$

with

$$\sum_{k=1}^{c} b_k = \sum_{i=1}^{a} \alpha_i = \sum_{j=1}^{b} \beta_j = 0; \qquad \sum_{i=1}^{a} \gamma_{ij} = 0 \text{ all } j; \qquad \text{and} \quad \sum_{j=1}^{b} \gamma_{ij} = 0 \text{ all } i;$$

where $y_{ijk}$ is the observation in the kth block on the treatment combination $A_iB_j$, $\mu$ is the
general mean, $b_k$ the block effect, $\alpha_i$ the effect of treatment $A_i$, $\beta_j$ the effect of treatment
$B_j$ and $\gamma_{ij}$ the interaction of treatments $A_i$ and $B_j$.

The usual assumptions made about the errors are that each $\epsilon_{ijk}$ is the sum of a whole-plot
error, with zero mean and variance $\sigma_w^2$, and a subplot error, with zero mean and variance $\sigma_s^2$,
and further that the whole-plot and subplot errors are uncorrelated. We shall examine the
more general model in which the $\epsilon_{ijk}$'s are independently distributed about zero, except
for correlations between the $\epsilon$'s for the several subplots of each main plot, and in which the
$\epsilon$'s for each subplot treatment may have a different variance. Thus we take

$$E(\epsilon_{ijk}^2) = \sigma_j^2.$$

The more usual assumptions mentioned above would have

$$\sigma_j^2 = \sigma_w^2 + \sigma_s^2.$$


This paper is concerned, in the main, with the particular case b = 2. Except when the
contrary is explicitly stated, it is hereafter assumed that each main plot has only two
subplots; of course, if b > 2 and a particular pair of subplot treatments is of interest, an analysis
as for b = 2 can be performed for these subplots alone.


2. A TEST OF THE HYPOTHESIS $\sigma_1^2 = \sigma_2^2$

Define S and D by

$$S = y_{i1k} + y_{i2k}, \qquad D = y_{i1k} - y_{i2k}.$$

Further, let $E_{SS}$, $E_{SD}$ and $E_{DD}$ denote the error sums of squares and sums of products in
the analysis of covariance of S and D shown in Table 1. The breakdown of the sums of
squares for S and for D is the usual one for whole-plot totals and differences except that the
a(c-1) degrees of freedom for the subplot error have been split into (c-1) for the interaction
of treatments B and blocks and a further (a-1)(c-1).


Table 1. Analysis of covariance of S and D

| Components for analysis of S | d.f. | Sum of squares $S^2$ | Sum of products $SD$ | Sum of squares $D^2$ | d.f. | Components for analysis of D |
|---|---|---|---|---|---|---|
| Adjustment for general mean | 1 | - | - | - | 1 | B |
| A | a-1 | - | - | - | a-1 | AB |
| Blocks | c-1 | - | - | - | c-1 | Subplot error |
| Whole-plot error | (a-1)(c-1) | $E_{SS}$ | $E_{SD}$ | $E_{DD}$ | (a-1)(c-1) | Subplot error |
| Total | ac | | | | ac | Total |

The last two lines of the analysis of D are bracketed together as the subplot error.






The covariance of S and D, in subplot units, is $(\sigma_1^2 - \sigma_2^2)$. Therefore the F test for
regression of S on D (or D on S) provides a test of the hypothesis $\sigma_1^2 = \sigma_2^2$. Under the null
hypothesis $\sigma_1^2 = \sigma_2^2$, and the usual normality assumptions,

$$F = \frac{m E_{SD}^2}{E_{SS}E_{DD} - E_{SD}^2}$$

has an F distribution with 1 and m degrees of freedom where

$$m = (a-1)(c-1) - 1.$$
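
As an illustration of the test, the short Python sketch below evaluates F from the three error quantities. The function name `variance_equality_f` is invented here; the figures passed to it are the error sums of squares and products quoted in the numerical example of Section 5 (15 d.f. for error, so m = 14). Comparison with tabulated percentage points of F is left to the reader.

```python
def variance_equality_f(e_ss, e_sd, e_dd, error_df):
    """F ratio for the regression of S on D, with 1 and m = error_df - 1 d.f."""
    m = error_df - 1
    residual = e_ss * e_dd - e_sd ** 2
    if residual <= 0:
        raise ValueError("E_SS * E_DD must exceed E_SD squared")
    return m * e_sd ** 2 / residual, m


# Error line of the example: E_SS, E_SD, E_DD on (a-1)(c-1) = 15 d.f.
f_value, m = variance_equality_f(37618.0, 19817.0, 39330.0, error_df=15)

print(round(f_value, 2), m)
```

The same value is obtained as the ratio of the regression mean square $E_{SD}^2/E_{DD}$ to the residual mean square, which is how the test is laid out in the example.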
3. CONFIDENCE LIMITS FOR $\sigma_1^2/\sigma_2^2$


Confidence limits for $\sigma_1^2/\sigma_2^2$ can be obtained by an extension of the method due to Pitman.
Instead of the covariance analysis of S and D, consider the similar covariance analysis of

$$S' = \frac{y_{i1k}}{\sigma_1} + \frac{y_{i2k}}{\sigma_2} \qquad \text{and} \qquad D' = \frac{y_{i1k}}{\sigma_1} - \frac{y_{i2k}}{\sigma_2}.$$

Let the error sums of squares and products be denoted by $E'_{SS}$, $E'_{SD}$ and $E'_{DD}$. S' and D'
are uncorrelated and therefore the variance ratio for testing the regression coefficient of
S' on D' against zero has an F distribution with 1 and m degrees of freedom, where m is again
ac-a-c. Thus

$$t = \frac{\sqrt{m}\,E'_{SD}}{\sqrt{(E'_{SS}E'_{DD} - E'^{2}_{SD})}} \qquad (1)$$

has a t distribution with m degrees of freedom.


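Now

$$S' = \frac{S+D}{2\sigma_1} + \frac{S-D}{2\sigma_2}$$

and therefore

$$E'_{SS} = \frac{E_{SS}+E_{DD}+2E_{SD}}{4\sigma_1^2} + \frac{E_{SS}+E_{DD}-2E_{SD}}{4\sigma_2^2} + \frac{E_{SS}-E_{DD}}{2\sigma_1\sigma_2} = \frac{E_1}{4\sigma_1^2} + \frac{E_2}{4\sigma_2^2} + \frac{r\sqrt{(E_1E_2)}}{2\sigma_1\sigma_2},$$

where $\tfrac{1}{4}E_1$ and $\tfrac{1}{4}E_2$ are the error sums of squares and $\tfrac{1}{4}r\sqrt{(E_1E_2)}$ is the error sum of products
in the covariance analysis of $y_{i1k}$ and $y_{i2k}$.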

[Of course $E_1$, $E_2$ and r could be obtained directly from a covariance analysis of $y_{i1k}$ and
$y_{i2k}$, but equality of $\sigma_1^2$ and $\sigma_2^2$ could not be tested by regarding $E_1/E_2$ as an ordinary variance
ratio because of the correlation between the two subplots in each main plot. When b = 2 it
seems more convenient to extend the usual analyses of variance of S and D into the
covariance analysis of Table 1. On the other hand, if b > 2, a test of equality of two particular
$\sigma_j^2$, and confidence limits for their ratio, may perhaps be obtained more readily by forming
the analysis of covariance of the appropriate $y_{ijk}$, $y_{ij'k}$ and taking $E_1$, $E_2$ and r from this.]

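We similarly have

$$E'_{DD} = \frac{E_1}{4\sigma_1^2} + \frac{E_2}{4\sigma_2^2} - \frac{r\sqrt{(E_1E_2)}}{2\sigma_1\sigma_2}$$

and

$$E'_{SD} = \frac{E_1}{4\sigma_1^2} - \frac{E_2}{4\sigma_2^2}.$$

By substitution in (1)

$$t = \frac{\sqrt{m}}{2\sqrt{(1-r^2)}} \left( \frac{\sigma_2}{\sigma_1}\sqrt{\frac{E_1}{E_2}} - \frac{\sigma_1}{\sigma_2}\sqrt{\frac{E_2}{E_1}} \right).$$

If the $\alpha$ percentage point of the t distribution with m degrees of freedom is written $\tau$ then

$$\Pr(-\tau < t < +\tau) = 1 - \alpha.$$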


Rearrangement of the inequalities leads to the final result that

$$\Pr\left[ \frac{E_1}{E_2}\{K - \sqrt{(K^2-1)}\} < \sigma_1^2/\sigma_2^2 < \frac{E_1}{E_2}\{K + \sqrt{(K^2-1)}\} \right] = 1 - \alpha,$$

where

$$K = 1 + \frac{2(1-r^2)\tau^2}{m},$$

$$E_1 = E_{SS} + E_{DD} + 2E_{SD},$$

$$E_2 = E_{SS} + E_{DD} - 2E_{SD},$$

$$r^2 = \frac{(E_{SS} - E_{DD})^2}{(E_{SS} + E_{DD})^2 - 4E_{SD}^2},$$

$$m = (a-1)(c-1) - 1,$$

and $\tau$ = $\alpha$ percentage point of the t distribution with m degrees of freedom.
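
The limits can be evaluated in a few lines. The Python sketch below is an illustration only: the function name `variance_ratio_limits` is invented, the percentage point $\tau$ is assumed to be read from tables rather than computed, and the figures used are those of the numerical example in Section 5 (m = 14, $\tau$ = 2.145).

```python
import math


def variance_ratio_limits(e_ss, e_sd, e_dd, m, tau):
    """Lower and upper confidence limits for sigma_1^2 / sigma_2^2."""
    e1 = e_ss + e_dd + 2 * e_sd
    e2 = e_ss + e_dd - 2 * e_sd
    # (E_SS + E_DD)^2 - 4 E_SD^2 factorises as E_1 * E_2.
    r_squared = (e_ss - e_dd) ** 2 / (e1 * e2)
    k = 1 + 2 * (1 - r_squared) * tau ** 2 / m
    spread = math.sqrt(k * k - 1)
    ratio = e1 / e2
    return ratio * (k - spread), ratio * (k + spread)


lower, upper = variance_ratio_limits(37618.0, 19817.0, 39330.0, m=14, tau=2.145)

print(round(lower, 2), round(upper, 2))
```

Because $K - \sqrt{(K^2-1)}$ and $K + \sqrt{(K^2-1)}$ are reciprocals, the two limits are symmetric about $E_1/E_2$ on a logarithmic scale.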


4. OTHER SPLIT-PLOT DESIGNS


If the whole-plot design is more complicated than simple randomized blocks (e.g. a Latin 
square), an exactly analogous covariance analysis to that given above will lead to the test 
of equality for the subplot variances and to the confidence limits for their ratio. 

A further application is to Graeco-Latin squares in which the Greek and Latin letters 
represent two different sets of treatments, applied separately to the two halves of each 
whole-plot (Yates, 1937, § 167). The errors associated with the two particular treatments 
occurring together in any whole-plot may be correlated and have different variances.
Extending the analyses of variances of sums and differences (Yates, 1937) into a covariance 
analysis, we have, for a p x p square, Table 2. 


Table 2. Analysis of covariance of sums and differences for a Graeco-Latin square

| Components for analysis of S | d.f. | Sum of squares $S^2$ | Sum of products $SD$ | Sum of squares $D^2$ | d.f. | Components for analysis of D |
|---|---|---|---|---|---|---|
| Adjustment for general mean | 1 | - | - | - | 1 | Latin v. Greek |
| Latin treatments | p-1 | - | - | - | p-1 | Latin treatments |
| Greek treatments | p-1 | - | - | - | p-1 | Greek treatments |
| Rows | p-1 | - | - | - | p-1 | Subplot error |
| Columns | p-1 | - | - | - | p-1 | Subplot error |
| Whole-plot error | (p-1)(p-3) | $E_{SS}$ | $E_{SD}$ | $E_{DD}$ | (p-1)(p-3) | Subplot error |
| Total | $p^2$ | | | | $p^2$ | Total |

The last three lines of the analysis of D are bracketed together as the subplot error.


The extension of the covariance analysis to more than one square would follow the usual 
lines. The tests of significance and determination of confidence limits proceed exactly as 
above for the simpler type of split-plot design. 
5. EXAMPLE


In an experiment to compare the yields of six strains of cocksfoot (Dactylis glomerata), the 
strains were allocated at random to six whole-plots in each of four replicates. Each whole- 
plot was split into two for the comparison of high and low levels of fertilizer. Yields of grass 
from the first cut were about twice as much at the high level of fertilizer as at the low, and 
it was therefore suspected that the error variances for these two levels might be different. 

The analysis of covariance is given in Table 3 and the test for the regression of S on D
in Table 4. The value for F shows that $\sigma_1^2$ and $\sigma_2^2$ differ significantly at the 5% level.


Table 3. Analysis of covariance of S and D

| Components for analysis of S | d.f. | Sum of squares $S^2$ | Sum of products $SD$ | Sum of squares $D^2$ | d.f. | Components for analysis of D |
|---|---|---|---|---|---|---|
| Adjustment for general mean | 1 | 293,594 | 128,418 | 56,170 | 1 | Fertilizer |
| Strains | 5 | 30,784 | -2,152 | 18,399 | 5 | Interaction |
| Replicates | 3 | 16,038 | 922 | 585 | 3 | Subplot error |
| Whole-plot error | 15 | 37,618 = $E_{SS}$ | 19,817 = $E_{SD}$ | 39,330 = $E_{DD}$ | 15 | Subplot error |
| Total | 24 | 378,034 | 147,005 | 114,484 | 24 | Total |

The last two lines of the analysis of D are bracketed together as the subplot error.

Table 4. Test for regression of S on D

| | d.f. | S.S. | M.S. | F |
|---|---|---|---|---|
| Regression of S on D | 1 | 9,985 | 9,985 | 5.06 |
| Residual | 14 | 27,633 | 1,974 | - |
| Whole-plot error | 15 | 37,618 | - | - |

Now

$$E_1 = E_{SS} + E_{DD} + 2E_{SD} = 116{,}582,$$

$$E_2 = E_{SS} + E_{DD} - 2E_{SD} = 37{,}314,$$

$$E_1/E_2 = 3.1244,$$

$$r^2 = \frac{(E_{SS} - E_{DD})^2}{E_1E_2} = 0.0007.$$

For 95% confidence limits, $\tau$ is the 5% point of the t distribution with m = 14 d.f. Thus
$\tau$ = 2.145 giving K = 1.657 and $\sqrt{(K^2-1)}$ = 1.321. Therefore 95% confidence limits for
$\sigma_1^2/\sigma_2^2$ are (1.05, 9.30).
Standard errors for various treatment comparisons


Let $E_a$ and $E_b$ be the whole-plot and subplot error mean squares based on 15 and 18 d.f.,
respectively. Then, if the error variances for the two subplot treatments are assumed
equal, the standard errors for the various treatment comparisons are as follows:

Difference between two strain means $\sqrt{(\tfrac{1}{4}E_a)}$ = 25.0.

Difference between the two fertilizer means $\sqrt{(E_b/12)}$ = 13.6.

Difference between the two fertilizer means for a particular strain $\sqrt{(\tfrac{1}{2}E_b)}$ = 33.3.

Difference between two strain means at the same fertilizer level $\sqrt{[\tfrac{1}{4}(E_a+E_b)]}$ = 34.4.

If the subplot error variances are different, the last standard error needs to be
reconsidered. The separate analysis of the two levels leads to exact t tests with 15 d.f. and the
standard errors are, for the high level,

$$\frac{\sqrt{(E_{SS} + E_{DD} + 2E_{SD})}}{2\sqrt{15}} = 44.1,$$

and for the low level

$$\frac{\sqrt{(E_{SS} + E_{DD} - 2E_{SD})}}{2\sqrt{15}} = 24.9.$$

More precise estimates of the standard errors are

$$\frac{1}{2}\sqrt{\left( \frac{E_{SS}}{15} + \frac{E''_{DD} + 2E''_{SD}}{18} \right)} = 41.9 \qquad \text{and} \qquad \frac{1}{2}\sqrt{\left( \frac{E_{SS}}{15} + \frac{E''_{DD} - 2E''_{SD}}{18} \right)} = 24.6,$$

where $E''_{DD}$ and $E''_{SD}$ are the error sum of squares and sum of products for the differences
based on all 18 d.f. The cost of including the extra 3 d.f. is that exact t tests are no longer
possible.
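
The standard errors quoted in this section can be reproduced from the error lines of Table 3. The Python sketch below is an illustration under one stated assumption: the 18 d.f. quantities are taken to be the sums of the 3 d.f. and 15 d.f. lines of the analysis of D (and of the corresponding sums of products), which is how the subplot error is bracketed in the table. All variable names are invented.

```python
import math

# Error line of Table 3 (15 d.f.).
e_ss, e_sd, e_dd = 37618.0, 19817.0, 39330.0
# Assumed pooling of the 3 d.f. line with the 15 d.f. line (18 d.f. in all).
e_dd_18 = e_dd + 585.0
e_sd_18 = e_sd + 922.0

e_a = e_ss / 15       # whole-plot error mean square
e_b = e_dd_18 / 18    # subplot error mean square

# Equal subplot variances assumed.
se_strains = math.sqrt(e_a / 4)
se_fertilizers = math.sqrt(e_b / 12)
se_fertilizers_one_strain = math.sqrt(e_b / 2)
se_strains_same_level = math.sqrt((e_a + e_b) / 4)

# Separate analyses of the two levels (exact t tests, 15 d.f.).
se_high_exact = math.sqrt(e_ss + e_dd + 2 * e_sd) / (2 * math.sqrt(15))
se_low_exact = math.sqrt(e_ss + e_dd - 2 * e_sd) / (2 * math.sqrt(15))

# Estimates using all 18 d.f. for the differences.
se_high_pooled = 0.5 * math.sqrt(e_ss / 15 + (e_dd_18 + 2 * e_sd_18) / 18)
se_low_pooled = 0.5 * math.sqrt(e_ss / 15 + (e_dd_18 - 2 * e_sd_18) / 18)

for name, value in [("strains", se_strains),
                    ("fertilizers", se_fertilizers),
                    ("fertilizers, one strain", se_fertilizers_one_strain),
                    ("strains, same level", se_strains_same_level),
                    ("high level, exact", se_high_exact),
                    ("low level, exact", se_low_exact),
                    ("high level, 18 d.f.", se_high_pooled),
                    ("low level, 18 d.f.", se_low_pooled)]:
    print(f"{name}: {value:.1f}")
```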


I am indebted to Mr G. J. F. Copeman of the North of Scotland College of Agriculture for 
permission to use part of the results of an experiment on strains of cocksfoot in the numerical 
example. 
A MAXIMUM-MINIMUM PROBLEM RELATED TO STATISTICAL
DISTRIBUTIONS IN TWO DIMENSIONS 


By A. J. HARRIS 


Road Research Laboratory, Department of Scientific and Industrial Research 


1. STATEMENT OF PROBLEM 


We consider the following problem, and some developments of it. In its finite form it may 
be stated thus. A mass is distributed among the hk cells of a rectangle which has been divided 
into h rows and k columns. The content of each row and of each column is known, but the 
contents of the elementary cells are unknown. What is the least (or greatest) possible content 
of any given group of cells? 

In the infinite form we have a density distribution f(x, y) over the rectangle $0 \le x \le a$,
$0 \le y \le b$. The group of cells is replaced by a given area which we call Region 1 or R1, the
remainder of the rectangle being R2. Since the maximum content of R1 can be found by
finding the minimum content of R2 we confine ourselves to minima. We require m, where

$$m = \min \iint_{R1} f(x, y)\,dx\,dy \qquad (1)$$

subject to $f(x, y) \ge 0$,

$$\int_0^b\!\!\int_0^x f(x, y)\,dx\,dy = A(x), \qquad \int_0^y\!\!\int_0^a f(x, y)\,dx\,dy = B(y), \qquad (2)$$

where A(x) and B(y) are given monotonic functions increasing from 0 to A(a) = B(b) = N,
N being the total mass. It may also be necessary to assume that A(x) and B(y) possess finite
derivatives at every point.

It is convenient, as in Fig. 1, to take the origin of co-ordinates at the top left-hand corner 
of the rectangle, and to take the x axis to the right and the y axis down. We shall use the 
A(x) and B(y) notation, even in the finite form of the problem, to denote the total content of 
the columns to the left of the point (x, y), and the rows above it, respectively.

The problem in its finite form may be solved by Linear Programming methods. It is 
a special case of the Transportation Problem discussed by Hitchcock (1941) and Vajda (1955) 
with transport charges either zero or unity. The solution is obtained here by a different 
method. 


2. SOLUTION 


It is obvious that m, the minimum content of R1, cannot be less than the minimum content
of a region entirely contained in R1; in particular, it cannot be less than that of a rectangle
contained in R1. Consider the minimum contents of all such rectangles and find the greatest,
say m'; then

$$m \ge m'. \qquad (3)$$

It will be proved that the equality sign holds.
The minimum content of a region equals the minimum content of the enclosed rectangle
of greatest minimum content. In this connexion the word rectangle stands not only for
a compact rectangle, but for any area which can be formed into a compact rectangle by
a rearrangement of rows and columns. Such a rearrangement has no effect on the problem,
at least in the finite form. The usefulness of this result lies in the fact that the minimum
content of a rectangle can be immediately written down, and that for simple forms of R1 the
rectangle of greatest minimum content can easily be discovered. It is not necessary to find
even one of the distributions of mass which actually satisfy the minimum condition.


3. PROOFS


The proofs which follow are complete for the finite form of the problem, both for unrestricted
and for integral values of the cell contents. The infinite form has been proved subject to
a restriction which involves both the nature of the area R1 and the marginal distributions
A(x) and B(y). Some developments of the problem are discussed: they include the minimum
content of an area contained in R1 when the content of R1 is maintained at its minimum.
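
For a small instance with integral cell contents the result can be checked by exhaustive search. The Python sketch below is an illustration only and is not the method of proof used in the paper. It compares the greatest minimum content over rectangles inside R1 (a rectangle being any set of rows combined with any set of columns) with the least content of R1 found by enumerating every table having the given margins. The margins, the region and the function names are invented.

```python
from itertools import combinations


def rectangle_bound(row_totals, col_totals, region1):
    """Greatest value of (rows + columns - N) over rectangles lying inside region1."""
    n = sum(row_totals)
    best = 0  # the minimum content can never be negative
    for size in range(1, len(row_totals) + 1):
        for row_set in combinations(range(len(row_totals)), size):
            # Largest set of columns whose cells in these rows all lie in region1.
            col_set = [j for j in range(len(col_totals))
                       if all((i, j) in region1 for i in row_set)]
            content = (sum(row_totals[i] for i in row_set)
                       + sum(col_totals[j] for j in col_set) - n)
            best = max(best, content)
    return best


def brute_force_minimum(row_totals, col_totals, region1):
    """Least content of region1 over all non-negative integer tables with these margins."""
    h, k = len(row_totals), len(col_totals)
    best = sum(row_totals)  # no table can place more than N in region1

    def fill(cell, row_left, col_left, in_region1):
        nonlocal best
        if cell == h * k:
            if not any(row_left) and not any(col_left):
                best = min(best, in_region1)
            return
        i, j = divmod(cell, k)
        for v in range(min(row_left[i], col_left[j]) + 1):
            row_left[i] -= v
            col_left[j] -= v
            fill(cell + 1, row_left, col_left,
                 in_region1 + (v if (i, j) in region1 else 0))
            row_left[i] += v
            col_left[j] += v

    fill(0, list(row_totals), list(col_totals), 0)
    return best


row_totals = [3, 2, 2]
col_totals = [2, 3, 2]
region1 = {(0, 0), (0, 1), (1, 0)}   # an L-shaped group of cells, (row, column)

bound = rectangle_bound(row_totals, col_totals, region1)
exact = brute_force_minimum(row_totals, col_totals, region1)
print(bound, exact)
```

The exhaustive search grows very quickly with the size of the table and is meant only as a check on tiny cases.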


(i) Minimum content of a rectangle 


Consider Fig. 1. Vertical lines are denoted by L, horizontal lines by M, and rectangular
areas by the lines which bound them. Region 1, whose minimum is required, is the rectangle
$L_0L_1M_0M_1$ in the top left-hand corner. Suppose that the cell at P in R1 is not empty and
that a cell at Q in $L_1LM_1M$ is also not empty. Then without violating the marginal conditions
we can transfer some mass from P to P' provided we transfer an equal amount from Q to Q'.
This move reduces the content of R1. The process can be continued until either R1 or
$L_1LM_1M$ becomes empty. If R1 empties, its minimum content is clearly zero. If $L_1LM_1M$
empties, the content of $L_1LM_0M_1$ is N - A(x), that of $L_0L_1M_1M$ is N - B(y), and
consequently the content of R1 is A(x) + B(y) - N. It is easily seen that this is the least possible
content of R1; any smaller content requires a negative content in the diagonally opposite
rectangle $L_1LM_1M$. The result may be summed up as follows. If

$$A(x) + B(y) - N \ge 0 \qquad (4a)$$

the minimum content is A(x) + B(y) - N, and the necessary and sufficient condition that the
content should have this minimum value is that the diagonally opposite rectangle $L_1LM_1M$
should be empty. If

$$A(x) + B(y) - N < 0 \qquad (4b)$$

then the minimum content is zero.
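
The rule for a rectangle is short enough to state as a function. The Python sketch below is illustrative: the function name and the 2 x 2 table are invented, with the table chosen so that the diagonally opposite cell is empty and the minimum is attained.

```python
def rectangle_minimum(a_x, b_y, n):
    """Least content of the top left-hand rectangle: A(x) + B(y) - N, or zero."""
    return max(a_x + b_y - n, 0)


# Invented example: N = 10, columns left of x hold A(x) = 7, rows above y hold B(y) = 6.
minimum = rectangle_minimum(7, 6, 10)

# A table attaining it: rows are (above y, below y), columns are (left of x, right of x).
table = [[3, 3],
         [4, 0]]   # the diagonally opposite rectangle is empty

print(minimum, table[0][0])
```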


(ii) Finite problem, general R1, cell contents not restricted to integral values


Any row or column which is completely empty may be disregarded without affecting the 
problem. It is convenient to assume that this has been done, so that no row or column is 
completely empty. It is convenient also to refer to the processes we employ in these proofs in 
an easily picturable way. Therefore, although it is essential to the present proof that we do 
not restrict ourselves to integral values of the cell contents, we visualize the problem as one 
of arranging on the rectangle an immensely large number of little balls each representing an 
equal almost infinitesimal mass. We shall refer to an element of R1 as a 1-space, of R2 as
a 2-space. In like manner a ball in a 1-space is a 1-ball, in a 2-space a 2-ball. Although we use 
the language of the integral problem, referring to individual balls for example, we do not 
assume any definite number of balls and may change the number in the course of the
argument if necessary. 

Suppose first that a solution of the minimum problem has been found. Certain spaces 
contain balls, certain spaces are empty. If another solution of the problem is possible, 
a superposition of the two solutions is also a solution (the number of balls is different of 
course in the three solutions, but as already explained this is irrelevant). The new solution 
has the property that any cell which contained a ball in either of the first two solutions must 
contain a ball in the new solution. By superposing all possible solutions we therefore obtain 
a solution in which any space which can contain a ball does contain one. This is the solution 
with which we conduct the argument. 
Fig. 1. Rectangular region 1.


Suppose now that a column contains a 1-ball, as for example at P in Fig. 1, where $L_1$ and
$M_1$ are now to be ignored. Suppose also that the cell at Q' in this column is a 2-space. Then
the row which contains Q’ cannot contain a 1-ball. For suppose there were a 1-ball at Q, then 
we might move the ball at Q to Q’, and the ball at P to the corresponding position P’. The 
totals in the rows and columns are unchanged so that the boundary conditions are still 
satisfied. But in the move from Q to Q’ a 1-ball has become a 2-ball. The move from P to P’ 
cannot reverse this change and may even increase it, therefore the minimum content of 
Region 1 has been reduced. Since this is impossible, there cannot have been a 1-ball in the 
row containing Q’. 
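
The compensated move used throughout these proofs leaves every row and column total unchanged. The Python sketch below illustrates this on an invented 2 x 2 table; the function name is hypothetical, and P' and Q' are taken to be the remaining two corners of the rectangle with diagonal PQ, with Q' in the column of P as in the argument above.

```python
def compensated_move(table, p, q, amount):
    """Move `amount` from P to P' and from Q to Q' without altering any margin."""
    (p_row, p_col), (q_row, q_col) = p, q
    if p_row == q_row or p_col == q_col:
        raise ValueError("P and Q must lie in different rows and different columns")
    if table[p_row][p_col] < amount or table[q_row][q_col] < amount:
        raise ValueError("not enough mass at P or Q")
    moved = [row[:] for row in table]
    moved[p_row][p_col] -= amount
    moved[q_row][q_col] -= amount
    moved[p_row][q_col] += amount   # P': row of P, column of Q
    moved[q_row][p_col] += amount   # Q': row of Q, column of P
    return moved


def margins(table):
    """Row totals and column totals of a table."""
    return [sum(row) for row in table], [sum(col) for col in zip(*table)]


before = [[2, 0],
          [1, 3]]
after = compensated_move(before, p=(0, 0), q=(1, 1), amount=2)
print(after, margins(before) == margins(after))
```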

We now carry out a rearrangement of the rows and columns. All rows which contain
a 1-ball are collected at the top; all columns containing a 1-ball at the left-hand side. In
Fig. 2 this rearrangement defines the lines $L_1$ and $M_1$; any row above $M_1$ and any column to
the left of $L_1$ contains a 1-ball. Because of the nature of the solution with which we are
working (every space filled which can be) there is no solution which has a 1-ball outside the
rectangle $L_0L_1M_0M_1$. It is easily seen that $L_0L_1M_0M_1$ consists entirely of 1-space. If any
elementary rectangle within it were a 2-space we should have the situation which in the
previous paragraph was proved impossible: a 2-space whose intersecting row and column
both contain a 1-ball.
To indicate in Fig. 2 that every row within a certain area contains a ball, we mark in the
rows and put a dot, representing a ball, at the end of each row. The rectangle $L_0L_1M_0M_1$ has
both rows and columns marked in this way because both rows and columns contain a 1-ball.
To indicate in a similar way the presence of an elementary 1-space or 2-space we use the
same convention but replace the dot by a small circle.

We next carry out a rearrangement of the rows below $M_1$ and the columns to the right of
$L_1$. We collect at the bottom those rows which have a 2-space to the left of $L_1$, thus obtaining
the dividing line $M_2$. Every row in $L_0L_1M_2M$ therefore contains a 2-space, as shown. In
a similar way we obtain $L_2$, so that each column in $L_2LM_0M_1$ contains a 2-space. The regions
$L_0L_1M_1M_2$ and $L_1L_2M_0M_1$ consist, like $L_0L_1M_0M_1$, entirely of 1-space since not to do so
would contradict the principle by which $L_2$ and $M_2$ were defined. Since these regions are
entirely 1-space, yet lie outside the region occupied by 1-balls, they must also be empty.
Fig. 2. Solution, finite form.


Consider now the rectangle L, LM, M, which will be proved empty. Suppose that it is not 
empty and that element A contains a 2-ball. Then since A lies below Mj, there is a 2-space at 
some space B in the same row as A and to the left of Z,. Again, since every column in 
IL, L,M,.M, contains a 1-ball there must be one at some space C. If we move the ball from A to 
B and compensate by a move from C to D we satisfy the boundary conditions. But the arrival
of a ball at D is impossible. If D is a 1-space we have a solution of the problem with a 1-ball 
outside the rectangle L, L,.M)M, within which all 1-balls must be found. If D is a 2-space we 
have reduced the minimum content of Region 1. Since both these results are impossible the 
rectangle L, 1. M,M must be empty. Similarly L,1.M,M is empty. The empty regions are 
shaded diagonally in Fig. 2. 

Consider now the rectangle L, L,M,M,. It consists entirely of 1-space. Its content is the 
minimum content of Region 1, since it contains all the 1-balls and nothing else. The 
diagonally opposite rectangle L,LM,M is empty and this implies that the content of 
[,L,.M,M, is at its minimum. Thus m, the minimum content of Region 1, equals the mini- 
mum content of i,L,M,M,. If there were another rectangle lying entirely within Region 1, 
whose minimum content m’ exceeded that of I, L,.M,M,, we should have 


$$m' > m. \qquad (5)$$

But since this rectangle lies within Region 1 it cannot have a minimum exceeding that of
Region 1, hence

$$m' \le m. \qquad (6)$$

Since (5) and (6) are incompatible there is no such rectangle whose minimum m' exceeds m;
in other words L, L, My M, is a rectangle of greatest minimum content in Region 1. We may 
prove exactly the same thing of L, l,M,™,. 

We have therefore shown that the minimum content of Region 1 equals the minimum 
content of a rectangle which lies entirely within Region 1 and has the greatest minimum 
content of all such rectangles. 

When there is a unique rectangle of greatest minimum content, $L_1$ and $L_2$ coincide and so
do $M_1$ and $M_2$. Since we are assuming that there are no completely empty rows or columns it
is easily seen that this is the only possible arrangement compatible with the existence of
a unique rectangle.
Fig. 3. Solution for monotonic curves.


It should perhaps be added that the proof is valid only when the minimum content of 
Region 1 is different from zero. But if the minimum content is zero, then the minimum 
content of any rectangle lying within Region 1 must also be zero. The theorem is therefore 
true generally. 

When the division into Regions 1 and 2 is simple, it is usually easy to find the rectangle of
greatest minimum content. Fig. 3 shows the simplest case: R1 is above a monotonic curve
DEE'F. If we assume that R1 must be empty to the right of $L_1$, but that the column
immediately to the left of $L_1$ contains a 1-ball, it follows that $L_1LM_1M$ and all R1 below $M_1$
are empty spaces. Thus $L_0L_1M_0M_1$ is a rectangle of greatest minimum content, and it has
a corner on the boundary curve. If DEE'F is represented by $y = \phi(x)$ the content of such
a rectangle is $A(x) + B(\phi(x)) - N$, and m is the maximum of this function of x. For more
complicated boundaries the rectangle may exist in separated pieces, but its content may be
expressed as a function of x in a similar fashion.
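
In the finite form, a region above a monotonic boundary is a staircase of cells, and the maximization over x becomes a scan over columns. The Python sketch below is an illustration under that reading: `heights[j]` is assumed to give the number of cells at the top of column j that belong to R1, non-increasing from left to right. The function name and the figures are invented.

```python
def staircase_minimum(col_totals, row_totals, heights):
    """Maximum over columns of A(x) + B(phi(x)) - N, never less than zero."""
    n = sum(col_totals)
    if sum(row_totals) != n:
        raise ValueError("row and column totals must have the same sum")
    if any(later > earlier for earlier, later in zip(heights, heights[1:])):
        raise ValueError("heights must be non-increasing")
    best = 0
    columns_so_far = 0
    for j, height in enumerate(heights):
        columns_so_far += col_totals[j]         # A(x): columns 0..j
        rows_above = sum(row_totals[:height])   # B(phi(x)): the top `height` rows
        best = max(best, columns_so_far + rows_above - n)
    return best


# Region 1 consists of the top two cells of column 0 and the top cell of column 1.
m = staircase_minimum(col_totals=[2, 3, 2], row_totals=[3, 2, 2], heights=[2, 1, 0])
print(m)
```

Only rectangles with a corner on the boundary need be examined, since enlarging a rectangle inside R1 can only raise its minimum content.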


(iii) Finite problem, general R1, cell contents restricted to integral values


The proof employs a process of rearrangement somewhat similar to that of the previous 
proof but more elaborate. We now have only N balls. If we have two solutions, the
superposition of these clearly does not produce a solution of the problem of N balls. It may not
even be a solution of the problem of 2N balls allowing half integral values.

We assume in this case that we have found one integral solution of the problem. In Fig. 4 
we collect as before into the regions MM, and L, L, those rows and columns which contain 
a 1-ball. It is clear, as before, that LL, MM, must consist entirely of 1-space, and that any
balls outside this space at present must be 2-balls. Within L, LZ, M, M, each row or column 
contains somewhere a 1-ball. This is indicated by the same convention as in Fig. 2.
Fig. 4. Integral values only.

We now rearrange the remaining rows and columns. Consider the columns J, £,. All rows 
containing a 2-space within L, L, M, M are moved to the bottom of the diagram, thus forming 
the group M, M. In a similar way we form the group L, L. As indicated in the figure each row
in L, L,.M, M contains a 2-space, and each column in L, iM, M,. The rectangles L, L,M, UM, 
and L, L, M,M, consist entirely of 1-space and are therefore empty. We may now prove that 
L, LM, M is empty. 

We show first that it is possible to rearrange the balls in L, L, M,M, so as to obtain a ball in 
any desired cell. If the cell is empty the row and column which pass through it must each 
contain a 1-ball in some other cell. It is possible to remove these two balls and to replace 
them by one in the desired cell and one in the compensating position. All the cells concerned 
lie inside L, L, M, M,, and all are 1-spaces so that no conditions are broken by such a move. 
Thus we can obtain a ball in any desired position within L, L,M,M,. 

Suppose that L,L.M, M is not empty but contains a 2-ball at A. Then there must be 
2-spaces at B and C in the same row and column as A. D is the space which completes the 
rectangle ABCD. We arrange to have a ball at D. The balls at A and D may now be replaced 
by balls at B and C, the net result of the move being a reduction in the 1-balls which are 
already at their minimum. This is impossible, so that L, 1M, M must be empty. 

Consider next the remaining space of the rows M, M, i.e. L, L,M,M. Columns which are 
empty in this rectangle we move to the right to join up with the empty rectangle L, DM, M. 










The dividing line between empty and filled columns is L,. The corresponding dividing line 
for L, LM, M, is M,. We may now prove that L, I, M, M, consists entirely of 1-space. 

Suppose the cell at P were a 2-space. The row containing P must contain a 2-ball within 
LD, LM, M,; let this be at Q. This in turn is below a 2-space at say R, which again is level with 
a 1-ball at some point S in L, 1, M,M,. We move the ball from Q to R and compensate from 
S to T. Since T is in 1-space the ball at T remains a 1-ball. In a similar way, using the moves
Q' to R' and S' to T' we get another 1-ball at T'. But the balls at T and T' may be replaced
by a 2-ball at P and a 1-ball in the compensating position. By this last move the minimum
is reduced. This is impossible, therefore P cannot be a 2-space. The only case in which this
argument breaks down is that in which S' coincides with S. When a ball has been moved
from S to T there may not in this case be another one to move to T'. But, instead of moving
a second ball from S to T' it is permissible to move the ball at T to P, and to obtain as before
the impossible reduction of the minimum. 

We now proceed as before; collecting between M, and MV, those rows which have a 2-space 
within L,L,M,M,. In a similar way we define L,. As before it follows that L, L,M,M, and 
L,I, M,M, are entirely 1-space and therefore empty. The rectangle L,L,M,M, may be 
proved empty, and then L; and M; are found as before by collecting the filled columns of 
LL, M,M, to the left, and the filled rows of L, L,.M,M, to the top. This process is continued 
until it comes to a stop. At each stage it is possible to prove that the newly found diagonal 
rectangle is entirely 1-space if it occurs towards the top left-hand corner of the diagram, or 
empty if it occurs towards the bottom right-hand corner. As an example we now prove that 
L,L,M,M, is empty. 

If there is a 2-ball at some point H, then it must be possible to move it up to a 2-space 
at F’, and to find another 2-ball at G which may be moved up to H, and a third at J which 
may be moved up to K. This multiple move may be compensated by moving a 1-ball from 
some point U to V. Now level with V there is the 2-space F’, below which is a 2-ball at G’. 
We move this to H’, then the ball from J’ to K’, and compensate the combined move by 
moving a 1-ball from U’ to V’. As there are now 1-balls at V and at V’ we can replace them 
by a 2-ball at F’ and a 1-ball at the appropriate point in L,L,M,M,. In this last move we 
reduce the minimum, and this is impossible. As before, the argument breaks down if K’ lies 
below U, but in this case the impossible move is to move V directly to F’. The rectangle 
L,1,M,M, is therefore empty. 

The process which we have described cannot be continued indefinitely since some rows or 
columns are used up at each step and the number available is finite. The end is reached when 
one of the processes of separation (empty strips from filled, or strips containing 2-space from 
those which are entirely 1-space) breaks down because all the remaining strips are of one 
kind. In Fig. 4 the end is reached when no columns containing 2-balls are found in L; L, M,.M, 
or LL, M; M,, which are therefore both empty. Since no complete row or column of the figure 
is empty the rectangle L;l,M,;M, must contain a 2-ball in every row or column, as shown. 

The final result in Fig. 4 is that all the 1-balls are contained in L,L;M,M,, and also in 
L,L,M,M;, and that both these rectangles are entirely 1-space, and are rectangles of 
minimum content since the diagonally opposite rectangles of L;LM,M and L,LM,M are 
empty. We have therefore obtained a solution of the problem in integers. It is clearly the same
solution as when the cell contents are not restricted to integers, since the minimum content 
of the rectangle of greatest minimum content is fixed by the boundary conditions alone, 
and does not depend on whether we specify integral or non-integral solutions. 
The process by which Fig. 4 is built up can end in a number of different ways. An examination
of the various possibilities shows that when it becomes impossible to subdivide the
remaining sets of rows or columns the 1-balls are always found to be contained in a rectangle 
of greatest minimum content consisting entirely of 1-space. The theorem is therefore true in 
all cases. 
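
The theorem for the finite integral problem can be checked numerically on a small case. The sketch below is an illustration only and forms no part of the proof. It assumes that the minimum content of a rectangle formed by a set of rows I and a set of columns J is max(0, a(I) + b(J) − N), where a and b are the row and column totals and N is the total number of balls, and it compares the greatest such value over rectangles lying wholly in Region 1 with a direct enumeration of every admissible arrangement of balls. The function names and the 3 × 3 example are hypothetical.

```python
from itertools import combinations, product


def rectangle_bound(a, b, region1):
    """Greatest minimum content over rectangles I x J lying wholly in Region 1.

    Assumes the minimum content of a rectangle is max(0, a(I) + b(J) - N).
    """
    total = sum(a)
    best = 0
    for k in range(1, len(a) + 1):
        for rows in combinations(range(len(a)), k):
            # Largest set of columns for which every cell of rows x cols is 1-space.
            cols = [j for j in range(len(b))
                    if all((i, j) in region1 for i in rows)]
            if cols:
                content = sum(a[i] for i in rows) + sum(b[j] for j in cols) - total
                best = max(best, content)
    return best


def tables(a, b):
    """Yield every non-negative integer table with row totals a and column totals b."""
    if len(a) == 1:
        if sum(b) == a[0]:
            yield [list(b)]  # the last row is forced by the remaining column totals
        return
    for row in product(*[range(min(a[0], bj) + 1) for bj in b]):
        if sum(row) == a[0]:
            rest = [bj - r for bj, r in zip(b, row)]
            for tail in tables(a[1:], rest):
                yield [list(row)] + tail


def brute_force_minimum(a, b, region1):
    """Least number of balls in Region 1 over all admissible arrangements."""
    return min(sum(t[i][j] for (i, j) in region1) for t in tables(a, b))


# Hypothetical example: Region 1 is a staircase in the top left-hand corner.
ROW_TOTALS = [3, 2, 1]
COLUMN_TOTALS = [2, 2, 2]
REGION_1 = {(0, 0), (0, 1), (1, 0)}

print(rectangle_bound(ROW_TOTALS, COLUMN_TOTALS, REGION_1),
      brute_force_minimum(ROW_TOTALS, COLUMN_TOTALS, REGION_1))  # both give 1
```

In this example the first row holds three balls but has only one column (of total 2) in Region 2, so at least one ball must lie in Region 1; the rectangle formed by the first row and the first two columns gives the same figure.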

(iv) Infinite problem, general R1 

The attempt to prove the theorem when the rows and columns are not given, and the 
rectangle may be divided up arbitrarily, leads to difficulties connected with the nature of the 
division of the rectangle into Regions 1 and 2. These difficulties may be avoided by restricting 
ourselves to divisions which satisfy the condition explained below. 

If we divide the figure into an arbitrary number of rows and columns each elementary cell 
so produced must be assignable to one of three groups: Group A, those cells consisting 
entirely of 1-space except perhaps for points on their boundaries; Group B, those consisting 
partly of 1-space and partly of 2-space; Group C, those consisting entirely of 2-space, except 
perhaps for points on their boundaries. The area formed by the cells of Group A we shall call 
Region 1’, and the area formed by Groups A and B together we shall call Region 1”. These 
regions are groups of cells as in the finite problem. 

We now assume that the minimum contents of the rectangles of greatest minimum content 
for Regions 1’ and 1” differ by a quantity ε which can be made as small as we please by
increasing the number of rows and columns. It seems likely that in most practical cases, 
where the subdivision is by means of one or more simple curves, and A(x) and B(y) possess 
finite derivatives at every point, it would be possible to show that this condition was fulfilled. 
If the derivatives exist, no line parallel to a side of the rectangle can carry a finite quantity 
of mass, and any finite number of such lines may be added to or deleted from an area without 
effect. 

As a first step in proving the infinite case we divide up our rectangle into a finite number of 
rows and columns as just described and consider the minimum contents of the Regions 1’ 
and 1”. Since the rows and columns are now given and the regions are of the correct type we 
can construct equivalent finite problems by replacing the infinite boundary conditions by 
the finite number of conditions which specify the contents of these rows and columns. We 
know that we can find solutions of the equivalent problems. We have still to prove, however, 
that the contents of the rows and columns can be so arranged within the rows and columns 
themselves that the same solution is also a solution in the infinite case, i.e. is valid for 
a subdivision into any other set of rows and columns. In a subdivision like that which gives 
Region 1’, where each cell belongs completely to one region or the other, there is no difficulty 
in rearranging the content of an elementary rectangle in any way we please. It is easily seen 
that in such a case the contents of the rows and columns may be rearranged so as to satisfy
the infinite boundary conditions also. Therefore the minimum content found for Region 1’ or
1” by treating it as a problem in finite form is also the minimum content for Region 1’ or 1” 
for the problem in infinite form. 

Let the minimum content of Region 1 subject to the infinite boundary conditions be m.
There must exist some rectangle R, contained in Region 1, whose minimum content $m_R$
exceeds or equals that of any other rectangle contained in Region 1. Now the rectangles of
greatest minimum content for Regions 1’ and 1” differ in content by less than some
quantity ε. Let these rectangles be denoted by R’ and R” and their minimum contents by
m’ and m”.
Now, apart from points confined to boundaries of rows or columns, Region 1’ is contained
in Region 1, which is contained in Region 1”; therefore their minimum contents under the 
infinite conditions are in ascending order, 


$m' \le m \le m''$. (7)


But since Region 1’ is contained in Region 1 the rectangle of greatest minimum content in 
Region 1’ cannot have a greater content than the corresponding rectangle of Region 1, and 
this in turn cannot have a greater content than Region 1 itself, therefore


$m' \le m_R \le m$. (8)

Remembering that m” exceeds m’ by less than ε we may write (7) and (8) as

$m' \le m_R \le m \le m' + \epsilon$. (9)


Since $m_R$ and m are quantities independent of the subdivisions which give m’ and ε, and
since ε can be made as small as we please, we must have


$m_R = m$, (10)


which proves that, in this case also, the minimum content of Region 1 equals that of the 
rectangle of greatest minimum content within it. 


(v) Minimum within a minimum, finite non-integral problem 


Suppose that after the rectangle has been divided into Regions 1 and 2 we subdivide 
Region 1 into Regions 1, and 1,, and require the minimum for Region 1, subject to the 
boundary conditions and to the extra condition that the content of Region 1 is maintained 
at its minimum. For brevity we shall refer to this as the conditional minimum for Region 1,. 
We assume that the subdivision is such that any cell of Region 1 belongs completely either 
to Region 1, or to Region 1,. 

In Fig. 5 the rows and columns have first been arranged as in Fig. 2 in finding the minimum 
of Region 1; the dividing lines for this arrangement being lettered as before L,, L,, M, and M,. 
Regions shaded at 45° from top right to bottom left are empty because of the minimum 
content of Region 1. The step curves X Y and WZ, which form part of the boundary of this 
shaded region, will be explained below. There may, of course, be other empty regions which 
cannot be shaded because their positions are unknown; elements of Region 1 which lie 
outside the rectangle L, L, M, M, are empty and there may be some of these to the right of XY 
and WZ. 

It is convenient in stating the proof to re-define Region 1,, to confine it to those parts of 
Region 1, which lie within the rectangle L, L, M, M,. This makes no difference to the answer 
we shall get for the minimum content, since any part of Region 1, which lies outside the 
rectangle L, L,M,M, must be empty, and the minimum content of Region 1, must therefore 
lie entirely within rectangle L, L,M,M,. We therefore assume that all the cells of Region 1, 
lie within L, L,M,M, and that all the rest of Region 1 is assigned to Region 1,. 

As before we assume that we have a solution of the problem and that by means of super- 
position any element which can contain a ball is made to do so. We now group the rows and 
columns of L,L,M,M, as we did for the first minimum. Rows and columns containing 
a 1,-ball are collected into M,M, and L,L,. Rows M,M, are those which have a 1,-space 
within L,Z,M,M,; columns L,L, similarly have a 1,-space in L,L,M,M,. We require also 
a grouping of the rows and columns which pass through the rectangle L,,M,M,. Those
which are empty within this rectangle are grouped below M, and to the right of L;. It cannot
be assumed that every row and column in L, L, M, M, is filled, because, although we assume
each complete row or column to be filled, there are other spaces available besides those
which lie within the rectangle L,L,M,M,.

Fig. 6. Formation of step curve.

We now rearrange the rows M, M in the following way, illustrated separately in Fig. 6. 
Consider the first column on the left. If any elements of it are 2-space we move to the bottom 
all the rows which contain these 2-spaces. The column will then be divided at A into an upper 
part consisting entirely of 1,-space and a lower part consisting entirely of 2-space. We then
repeat the process with the second column and the rows above A, that is, we ignore the rows 
already brought down. The process is continued in this way until we reach L, in Fig. 5. It is 
clear that the process produces a step curve ABCDEFG in which the elements forming the 
rising face of each step are entirely 2-space while the space above the step curve is entirely 
1,-space. As we have pointed out earlier the space below the step curve is not necessarily 
2-space.
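
The rearrangement just described is a simple sorting procedure and may be sketched as follows. The representation of each row as a list of truth values (true where the cell is a 2-space), and the names used, are assumptions made for illustration only.

```python
def step_curve(is_two_space):
    """Rearrange rows so that the 2-spaces form a step curve rising to the right.

    is_two_space[r][c] is True when the cell in row r, column c is a 2-space.
    Returns the new order of the rows (top to bottom) and, for each column,
    the number of rows lying below the step curve after that column is treated.
    """
    n_cols = len(is_two_space[0])
    upper = list(range(len(is_two_space)))  # rows not yet brought down
    lower = []                              # rows already brought down
    heights = []
    for col in range(n_cols):
        # Rows still above the curve which have a 2-space in this column
        # are moved to the bottom of the rows not yet brought down.
        brought_down = [r for r in upper if is_two_space[r][col]]
        upper = [r for r in upper if not is_two_space[r][col]]
        lower = brought_down + lower
        heights.append(len(lower))
    return upper + lower, heights


# Hypothetical arrangement of four rows and three columns.
GRID = [
    [False, False, True],
    [True, False, False],
    [False, False, False],
    [False, True, True],
]

order, heights = step_curve(GRID)
print(order, heights)  # [2, 0, 3, 1] [1, 2, 3]
```

The heights never decrease from left to right, and in every column the cells above the curve are free of 2-space, as the construction requires; cells below the curve may be of either kind.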

It is to be noted that if we select any group of columns, say those between C and G in 
Fig. 6, and change the order of the columns we shall get a different step curve. Nevertheless, 
it will begin at C and end at G as before. Every row which makes up the vertical height 
between C and G contains a 2-space, every row above G is entirely 1,-space. No rearrange- 
ment of the columns between C and G can alter this fact so that in any arrangement the 
steps must rise from C to G within this group of columns. 

In Fig. 5 the step curves X Y and WZ are constructed in this way. Since every row below 
M, contains by definition at least one 2-space to the left of L,, the X Y curve reaches up to
M, at some point to the left of L,. Similarly W is a point on L, above M,. We now see why 
the spaces to the left of X Y and WZ may be shaded to indicate that they are empty because 
of the first minimum: they are 1-space and outside the rectangle L, L, M,M,. 

Where L,, L, and L, intersect the step curve X Y we draw the horizontal lines M,, M, and M,.
Strictly speaking, it is the intersection of the column immediately to the left of the L line
which determines the intersection; e.g. where L, falls exactly on the edge of a step we must 
take the curve on the left to determine M,. The lines M,, M, and M, determine L,, L, and L, 
in a similar way. 

We may now show that certain areas in Fig. 5 are empty. We have already noted that 
rectangle L, L, MM, has a 1,-ball in every row and column, and that rectangle L, L; M,M, 
has a 1,-ball in every row and column. Rectangle L, L, M, M, has a 1,-space in each row, and 
L, L, M, M, a 1,-space in each column. These are all indicated by the conventions of Fig. 2.
The presence of 2-spaces in the risers of the step curves is also indicated in a similar fashion. 

By arguments similar to those used for the first minimum it follows that L,l,M,M, and 
LLM, M, consist entirely of 1,-space, and therefore apart from L, L, M, M, they are empty. 
The rectangles L, lL, M,M, and L,L,M,M, are empty because if filled they would enable us 
to perform impossible operations. Rectangles L,L,M, M and L,L.M,.M, are empty, because 
a ball in the first of these may be moved sideways to a 2-space under the step curve and 
compensated by removing a 1,-ball from L, LZ, M,M, to the right. This is the same sort of 
move as that which showed that L, Z, M,M, was empty. These spaces which have been shown 
to be empty by exactly the arguments used in finding the first minimum are shaded in Fig. 5 
at 45° from top left to bottom right. 

Now consider L, L, M, M,. If this is not empty let there be a 2-ball at A. This may be moved 
sideways to a 2-space at B, and there must be a 1,-ball above it at C which may be moved 
to D. The ball at D, whatever its nature, may be moved upwards to a 1,-space at E, and
a 1,-ball at F moved down to G, remaining a 1,-ball in the process. The net result of this set
of moves is that both minima are unaffected, but we have a 1,-ball at G in a row which by 
definition it cannot occupy. Therefore, the move is impossible, and L,2,M,M, must be 
empty. Similarly L, LM, M, is empty. 

It would be convenient to be able to prove that L;L,M, M, or L,L,M,M, was empty in 
order to get a completely empty rectangle with an upper left-hand corner on the step curve. 
We cannot do this because there are no 1,-balls above the step curve between L,; and L,. We
shall show in the next few paragraphs, however, that, if neither of these rectangles is empty,
it is possible to rearrange the columns L; 2, and rows M,M, and to find a line J, lying
between L; and L, whose intersection with the rearranged step curve defines a line Mj, for 
which the rectangle L,Z.M,, M is empty. A similar process applied to the rows M; M, and 
columns L,I, gives us the lines M, and L,,) and an empty rectangle L,,LM, M. The last 
elements of these empty regions are shaded steeply from top left to bottom right in Fig. 5. 


























































Fig. 7. Detail from Fig. 5. 


Fig. 7 shows an enlarged version of the region between L, and L,. In L,L;M,M, every row 
contains a 2-space which is indicated by the fact that the step curve rises to M, within 
the rectangle. We next inquire whether the lower right-hand rectangle defined by L; M, is 
empty or not. In Fig. 7 we have assumed that it is not empty, therefore some of the columns 
in L;L,M,M, must be filled. We move these filled columns to the left thus defining their 
boundary line L,,. The remaining rectangle L,, L, M, M, is therefore empty. We next inquire 
whether there are any rows containing 2-spaces in L,L,, M,M,. If there are none we shall 
have achieved our goal, since the step curve will then proceed horizontally along M, to its 
intersection with L,,, and we shall have an empty bottom right-hand rectangle L,, LM,M 
with its corner on the step curve. If, however, there are 2-spaces in L; L,, M,M, we collect 
them at the bottom, thus defining their upper boundary M,,. Between L, and L,, the step 
curve must therefore rise to M,,. 

We may now prove that L,L,M,,M, is empty. If there were a 2-ball at A it might be 
moved sideways to a 2-space at B, below which there is a 2-ball at C which may be moved 
to a 2-space at D. The compensating move, a 1,-ball from E to F, is exactly like the move
C to D in Fig. 5, and may similarly be shown to lead to an impossibility if continued upwards.
We then inquire whether the rectangle L,, ,M,, M, is empty. If it is we have reached our
goal, but if it is not we collect the filled columns to the left as before and define the line L,,.
From this point on we repeat the process. At each stage we inquire whether the rows above 
contain only 1-space or whether the columns on the right are empty. If the answer is 
affirmative we have reached our goal, if negative, we can reduce the remaining number of 
rows or columns. The process must come to an end because the number of rows and columns 
is finite. It can end in one of three ways, (1) in giving the sought for result, (2) because all the 
remaining rows up to M, contain 2-space, (3) because all the remaining columns up to L, 
contain 2-balls. If (2) or (3) occurs we can prove a contradiction. 

Fig. 7 illustrates case (2). All rows in L,,l,.M,M,, contain 2-spaces. By the type of 
argument just applied to the ABCD move we can show that L,I, M,M,, must be empty. 
But this contradicts our initial assumption that L, LZ, M,M, is not empty. 

If case (3) occurs it may be shown that the next step results in case (2). It may be seen from 
Fig. 7 that if when collecting the filled columns in the rectangle L,, L,M,, M, every column 
has been found to contain a 2-ball the next step would have been case (2), since the rows 
containing 2-spaces (or in other words the step curve) must reach up to M, before L, is
reached; see step curve in Fig. 5.

Thus, we may conclude that either L;l,M,M, or L,L,M,M, is empty or the process 
described above must at some stage produce the lines L, and M,, (Fig. 5) which intersect on 
the step curve and have an empty lower right quadrant. Similarly, there exist the lines M, 
and L,, with the corresponding properties for step curve WZ. 

Consider now certain areas of Fig. 5. These are indicated by the lines A,, A,, A, A, drawn 
below and alongside the rectangle. The rectangles which are opposite these lines, that is, 
Ly Lg My My, Ly L3M,My, LeLyy)M)M, and L,L,).M,M,, contain all the 1,-balls and consist 
either of 1,-space or of space which is empty because of the first minimum (space shaded at
45° from top right to bottom left in Fig. 5). The space opposite the gaps in the lines, L, L, M,M,,
L,1,M,M, LyLM,M, and L,,LM,M are all empty. If the space opposite the lines is 
rearranged as a compact rectangle, the space opposite the gaps becomes the diagonally 
opposite rectangle. Thus the content of the space opposite the A lines is at its minimum. We 
have therefore shown that the conditional minimum content of Region 1, (i.e. subject to 
existence of minimum for Region 1) equals the unconditional minimum content of a rectangle 
which lies within a larger region obtained by adding to Region 1, the space which is neces- 
sarily empty because of the minimum of Region 1. The same thing is true of the rectangle 
defined by the B lines in Fig. 5. 

Let the conditional minimum of Region 1, be denoted by m. Let the rectangle just discovered
be denoted by R; its unconditional minimum is also m. Let the enlarged region be
denoted by Region (1,+0) and let its unconditional minimum content be m’.

Since Region (1,+0) is empty apart from Region 1,, its content is also the content of
Region 1, which is m. Since no content can be less than the minimum content m’ we have


$m \ge m'$. (11)

But m is also the minimum content of rectangle R which lies entirely within the Region
(1,+0) and cannot therefore exceed the minimum for that region, therefore


$m \le m'$. (12)

Thus finally $m = m'$. (13)


It is also evident that R must be a rectangle of greatest minimum content for Region (1, + 0). 
Thus, we have shown that the conditional minimum of Region 1, is the unconditional
minimum content of the enlarged Region (1, + 0). 


(vi) Minimum within minimum : finite integral problem 


A proof can be constructed by combining the procedures of Figs. 4 and 5. 


(vii) Minimum within minimum: infinite problem 


No proof on the lines of that given for a single minimum has been found. It seems likely, 
however, that the theorem is true for most simple subdivisions of the rectangle. 


(viii) Extension to a third minimum


When the division into regions is by means of simple and similar monotonic curves as in 
Fig. 3, some of the difficulties of Fig. 7 do not arise. In Fig. 3 Region 1 is all the space above 
DEF; Region 1, is the space above GHJ, and the third minimum region is the space above 
NPQ. In this simple case it may be shown that the minimum for the space above NPQ, 
subject to the two preceding minima, is given by a rule similar to that found for two minima. 
In this case empty space to be added to form the enlarged region is that which is necessarily 
empty because of the first two minima. This suggests that the theorem may be generalized to 
still further minima. It may well be valid for subdivisions other than those by similar 
monotonic curves. 

(ix) Further generalizations 

It might be expected that similar theorems would hold in more than two dimensions, the 
rectangle of greatest minimum content being replaced by an n-dimensional rectangular 
parallelepiped. But this is not so. The minimum content of a region cannot be less than that 
of a rectangular parallelepiped contained in it, but in some three-dimensional cases it has 
been shown to be greater. 


4. EXAMPLES 


(i) Uniform distributions on a square 


An extremely simple case illustrates a peculiarity of the infinite problem: the appearance of 
line densities on the boundary of Region 1. Consider a unit square whose top left-hand 
diagonal half is Region 1. The distributions are uniform, i.e. A(x) = x, B(y) = y, and the 
diagonal is x+y = 1. The minimum content of a rectangle with its corner on the diagonal is 
zero, so that the minimum content is zero. It is easily seen that, if the square is divided into 
rows and columns, the entire mass must be distributed uniformly in the elementary half- 
squares just below the diagonal. In the infinite case this becomes a line density along the 
diagonal, but from our point of view it must be considered as lying in Region 2. By con- 
sidering a parallel line just outside the diagonal boundary we may see that the content of 
Region 1 may be made as small as we please without the formation of a line density on the 
diagonal. 
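
The finite form of this example is easily written down. The sketch below assumes a division of the unit square into n equal rows and n equal columns, places a mass 1/n in each cell cut by the diagonal (for the totals it is immaterial where in the cell the mass lies), and confirms that the row and column totals are those of the uniform distributions while every cell lying wholly in Region 1 is empty. The indexing convention (rows counted from the top, columns from the left) and the names are illustrative assumptions.

```python
from fractions import Fraction


def diagonal_solution(n):
    """Table with mass 1/n in each cell cut by the diagonal, zero elsewhere."""
    mass = Fraction(1, n)
    return [[mass if i + j == n - 1 else Fraction(0) for j in range(n)]
            for i in range(n)]


def region1_content(table):
    """Total mass in the cells lying wholly above and to the left of the diagonal."""
    n = len(table)
    return sum(table[i][j] for i in range(n) for j in range(n) if i + j < n - 1)


table = diagonal_solution(5)
row_totals = [sum(row) for row in table]
column_totals = [sum(col) for col in zip(*table)]
print(row_totals == column_totals == [Fraction(1, 5)] * 5, region1_content(table))
```

As n is increased the occupied cells shrink on to the diagonal, which is the line density referred to above.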
(ii) Normal probability distribution, minimum within a circle 

In a normal standardized bivariate probability distribution in x and y, centred on
x = y = 0, the probability in any strip of width dx is given by

$dp = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}x^2}\, dx$ (14)

and similarly for dy. Let us suppose that the boundary conditions for our problem are in fact
given by (14), that is, the distribution is at least compatible with the normal distribution. If
the distribution were in fact normal we know that the probability p in a circle of radius r with
centre at the origin would be given by

$p = 1 - e^{-\frac{1}{2}r^2}$. (15)

We then inquire what is the minimum probability within the circle of radius r which is compatible
with (14).
The rectangle of greatest minimum content is a square inscribed in the circle. If the
half-side of the square is denoted by d, the minimum content m of the square is given by

$m = \frac{2}{\sqrt{2\pi}} \int_{-d}^{d} e^{-\frac{1}{2}z^2}\, dz - 1$. (16)


Both m and p are given in Table 1 for a range of radii. It may be noted that the change over 
from zero content to positive content takes place where 


r = 0.954, (17)
i.e. a circle whose radius is almost equal to the standard deviation can be empty without 
making it impossible to satisfy the boundary conditions appropriate to a normal distribution. 
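
Formulae (15) and (16) are readily evaluated. The following sketch uses the error function, in terms of which (16) becomes m = 2 erf(r/2) − 1 when d = r/√2 is the half-side of the inscribed square, the value being replaced by zero when this expression is negative. The function names are chosen for illustration only.

```python
import math


def normal_content(r):
    """Content p of a circle of radius r in the normal distribution, formula (15)."""
    return 1.0 - math.exp(-0.5 * r * r)


def minimum_content(r):
    """Minimum content m of the circle, formula (16) with d = r / sqrt(2)."""
    return max(0.0, 2.0 * math.erf(r / 2.0) - 1.0)


def changeover_radius(tol=1e-9):
    """Radius at which the minimum content ceases to be zero, found by bisection."""
    low, high = 0.0, 2.0
    while high - low > tol:
        mid = 0.5 * (low + high)
        if 2.0 * math.erf(mid / 2.0) - 1.0 < 0.0:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


for r in (1.0, 2.0, 3.0):
    print(f"{r:.1f}  {minimum_content(r):.3f}  {normal_content(r):.3f}")
print(f"{changeover_radius():.3f}")
```

For r = 1.0, 2.0 and 3.0 this gives m = 0.041, 0.685 and 0.932 and p = 0.393, 0.865 and 0.989, in agreement with Table 1, and the change-over radius is 0.954 as in (17).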


Minimum within minimum 


Only a single case has been worked out. The first minimum is for a circle of radius 2 and the 
second for the concentric circle of radius 1. It was found that the conditional minimum of 
the inner circle was 0.233, whereas the unconditional minimum was only 0.041.


Table 1. Minimum content m of a circle of radius r compared with its content p 
in a normal distribution 




















 r  |   m   |   p   |  r  |   m   |   p   |  r  |   m   |   p
0.2 | 0.0   |       |     |       |       |     |       |
0.4 | 0.0   | 0.077 | 1.4 | 0.356 | 0.625 | 2.4 | 0.821 | 0.944
0.6 | 0.0   | 0.165 | 1.6 | 0.484 | 0.722 | 2.6 | 0.868 | 0.966
0.8 | 0.0   | 0.274 | 1.8 | 0.594 | 0.802 | 2.8 | 0.905 | 0.980
1.0 | 0.041 | 0.393 | 2.0 | 0.685 | 0.865 | 3.0 | 0.932 | 0.989


FURTHER CONTRIBUTIONS TO MULTIVARIATE
CONFIDENCE BOUNDS* 


By S. N. ROY AND R. GNANADESIKAN†
Institute of Statistics, University of North Carolina 


SUMMARY. In this paper the implications of certain results obtained in earlier papers (Roy & Bose,
1953b; Roy, 1954a,b, 1956) on confidence bounds on parametric functions connected with multivariate
normal populations are fully worked out. This leads to a number of confidence bounds,
expected to be useful, but hitherto unnoticed, on the characteristic roots connected with (i) one popula- 
tion dispersion matrix, (ii) two population dispersion matrices, (iii) the regression matrix of a p set 
on a q set, and (iv) the multivariate linear hypothesis on means, including, in particular, the problem 
of discriminant analysis. Some examples are given in the last section of the paper to illustrate the use 
of the techniques presented in the earlier sections. 


1. CONFIDENCE BOUNDS ON ROOTS CONNECTED WITH ONE DISPERSION MATRIX 


In conformity with the notation of previous papers, we consider a p-dimensional normal
variate with dispersion matrix $\Sigma$ and mean vector $\xi$, $N(\xi, \Sigma)$. The characteristic roots of $\Sigma$
are represented by $\Theta$. The dispersion matrix in a sample of size (n + 1) is written S and its
characteristic roots represented by $\theta$. The smallest and largest roots we denote respectively
by $\theta_1$ and $\theta_p$, adjoining a suffix $\alpha$ to values between which they lie with probability $1 - \alpha$,


writing for example 
P(O4, <9, <9, < , |) = 1-a. 


With a slight change in notation, let us denote the characteristic roots of a matrix M by c(M) 
and the smallest and largest roots respectively by $c_{\min}(M)$ and $c_{\max}(M)$.
The statement (3-1-2) given in Roy (1954a) is exactly equivalent to 


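$n\theta_{1\alpha}^{-1}(p, n)\, a'Sa \ge a'\Sigma a \ge n\theta_{2\alpha}^{-1}(p, n)\, a'Sa$ (1-1)
for all non-null a(p × 1)’s, that is, to


$\lambda_1 \frac{a'Sa}{a'a} \ge \frac{a'\Sigma a}{a'a} \ge \lambda_2 \frac{a'Sa}{a'a}$, (1-2)

where $\lambda_1$ and $\lambda_2$ stand respectively for $n\theta_{1\alpha}^{-1}(p, n)$ and $n\theta_{2\alpha}^{-1}(p, n)$.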
Choosing a so as to minimize $a'\Sigma a/(a'a)$, we observe that the second part of the inequality
(1-2) implies that $\lambda_2\theta_1 \le \Theta_1$; and choosing a so as to minimize $a'Sa/(a'a)$, we notice that the
first part of the inequality implies that $\Theta_1 \le \lambda_1\theta_1$. Likewise, choosing a so as to maximize
$a'Sa/(a'a)$, we note that the second part of the inequality implies that $\lambda_2\theta_p \le \Theta_p$; and choosing
a so as to maximize $a'\Sigma a/(a'a)$, we have that (1-2) implies that $\Theta_p \le \lambda_1\theta_p$. Thus (1-2) implies
the inequalities

$\lambda_1 c_{\min}(S) \ge c_{\min}(\Sigma) \ge \lambda_2 c_{\min}(S)$,
$\lambda_1 c_{\max}(S) \ge c_{\max}(\Sigma) \ge \lambda_2 c_{\max}(S)$. (1-3)


We note that (1-3) has a confidence coefficient $\ge 1 - \alpha$.
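
The passage from (1-2) to (1-3) may be illustrated numerically for p = 2. In the sketch below, which is no more than a check on a single hypothetical pair of matrices, $\lambda_1$ and $\lambda_2$ are taken to be the greatest and least values of the ratio $a'\Sigma a/(a'Sa)$, so that (1-2) holds by construction; they are not the tabulated values $n\theta_{1\alpha}^{-1}$ and $n\theta_{2\alpha}^{-1}$. The roots are found from the quadratic equations appropriate to 2 × 2 symmetric matrices. The matrices and function names are invented for the purpose.

```python
import math


def roots_quadratic(a, b, c):
    """Real roots (smaller, larger) of a*x**2 + b*x + c = 0 with a > 0."""
    disc = math.sqrt(b * b - 4.0 * a * c)
    return (-b - disc) / (2.0 * a), (-b + disc) / (2.0 * a)


def char_roots(m):
    """Smallest and largest characteristic roots of a 2 x 2 symmetric matrix."""
    (m11, m12), (_, m22) = m
    return roots_quadratic(1.0, -(m11 + m22), m11 * m22 - m12 * m12)


def ratio_extremes(sigma, s):
    """Least and greatest values of a'Sigma a / (a'S a): roots of det(Sigma - x S) = 0."""
    (g11, g12), (_, g22) = sigma
    (s11, s12), (_, s22) = s
    return roots_quadratic(
        s11 * s22 - s12 * s12,
        -(s11 * g22 + s22 * g11 - 2.0 * s12 * g12),
        g11 * g22 - g12 * g12,
    )


# Hypothetical sample and population dispersion matrices.
S = [[2.0, 1.0], [1.0, 3.0]]
SIGMA = [[3.0, 1.0], [1.0, 2.0]]

lam2, lam1 = ratio_extremes(SIGMA, S)
s_min, s_max = char_roots(S)
g_min, g_max = char_roots(SIGMA)

holds = (lam1 * s_min >= g_min >= lam2 * s_min
         and lam1 * s_max >= g_max >= lam2 * s_max)
print(holds)  # True
```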


* This research was jointly sponsored by the United States Air Force through the Office of Scientific 
Research of the Air Research and Development Command, and the Research Techniques Unit, London 
School of Economics and Political Science. 

† Now with the Procter and Gamble Company.
Going back to (1-2) let us take a(p × 1) such that the ith component is zero. Then arguing
in a similar manner, we observe that (1-2) implies


$\lambda_1 c_{\min}(S^{(i)}) \ge c_{\min}(\Sigma^{(i)}) \ge \lambda_2 c_{\min}(S^{(i)})$,
$\lambda_1 c_{\max}(S^{(i)}) \ge c_{\max}(\Sigma^{(i)}) \ge \lambda_2 c_{\max}(S^{(i)})$, (1-4)


for i = 1, 2, ..., p, where $S^{(i)}$, $\Sigma^{(i)}$ stand respectively for the ‘truncated’ sample and population
dispersion matrices obtained by cutting out the ith variate. Likewise, if we take an
a(p × 1) such that the ith and jth (i ≠ j) components are zero and then argue in a similar
manner, we observe that (1-2) also implies


$\lambda_1 c_{\min}(S^{(i,j)}) \ge c_{\min}(\Sigma^{(i,j)}) \ge \lambda_2 c_{\min}(S^{(i,j)})$,
$\lambda_1 c_{\max}(S^{(i,j)}) \ge c_{\max}(\Sigma^{(i,j)}) \ge \lambda_2 c_{\max}(S^{(i,j)})$, (1-5)


for i ≠ j = 1, 2, ..., p, where $S^{(i,j)}$, $\Sigma^{(i,j)}$ stand respectively for the truncated sample and
population dispersion matrices obtained by cutting out the ith and jth variates. We can continue
this process on to the stage of cutting out any (p − 1) variates, that is, retaining any one
variate. It is seen that (1-2) implies a pair of statements (1-3), and also p pairs of statements
like (1-4), $\binom{p}{2}$ pairs of statements like (1-5), and so on down to $\binom{p}{p-1}$, i.e. p statements
involving only one variate. All such statements will thus have a joint confidence coefficient
$\ge 1 - \alpha$, and will provide us, from a certain standpoint, with a complete analysis of what the
psychologists call the problem of principal components.

2. CONFIDENCE BOUNDS ON ROOTS CONNECTED WITH 
TWO POPULATION DISPERSION MATRICES 


Similarly, if we have two p-dimensional normal variates with mean vectors $\xi_1$ and $\xi_2$ and
dispersion matrices $\Sigma_1$ and $\Sigma_2$, we may write equation (3-2-1) of Roy (1954a), putting
Ay = (Ny/Mq) A7'(P, My, Ny) ANA Ag = (%4/Ny) O3,'(P, Ny, Ng), aS 
A, > all c(S,(u’)*D,,, w’Sp Dy, w) > Ag, (2-1) 
where $\gamma_i$’s are $c(\Sigma_1\Sigma_2^{-1})$’s (with i = 1, 2, ..., p). We next recall that
all non-zero c(A(p x q) B(q x p)) = the non-zero c(B(q x p) A(p x q)) 


and that c(S,Sy1) are invariant under a transformation: S, = AS? A’ and S, = AS} 4’, 
where A is any non-singular matrix. Putting A = ~~}, we rewrite (2-1), without any loss of 
generality, for our purpose, in the canonical form 








or A, > all c(S,8748,D/, ST'Dyy) >Az 
or Ay aS; ia > a'(S,Dyy, Sr Dy) a - 2 08,55 %0 ‘ (2-3) 
aa aa aa 


for all non-null a(p x 1)’s. Now choosing a so as to maximize the middle term of (2-3), we note 
that the left part of the inequality (2-3) implies that A, Cyax (S, S21) > Cmax (S;Dy,,Sr1* Dyy): 
and choosing a so as to minimize the middle term of (2-3), we note that the right part of (2-3) 
implies that c,,;,(S,Dy,,Si*Dy,,) > AsCmin (Sy S21). Thus (2-3) implies 


Ay Cmax (S,S2*) > Cmax (S, Dyy, Sy*D,,,) 2 Cin (S; Dyy St*Dyy,) > Asin (S, S37). (2-4) 
It has been shown in Roy (1956) that
Cmax (S,Dy,,Sy*Dy,,) > all ¢(D,,) > Cmin (81 Dyy,Si*Dyy,). (2-5) 
Thus it is seen that (2-3) implies
$\lambda_1 c_{\max}(S_1S_2^{-1}) \ge \text{all } c(\Sigma_1\Sigma_2^{-1}) \ge \lambda_2 c_{\min}(S_1S_2^{-1})$, (2-6)
which, therefore, is a confidence statement with a confidence coefficient $\ge 1 - \alpha$, since (2-3)
has the confidence coefficient $1 - \alpha$. (2-6) is derived in a slightly different way in Roy (1956).
We now go back to (2-3) and, as in the previous section, take a(p × 1) such that the ith
component is zero, argue the same way as from (2-3) to (2-6) and end up by observing that
(2-3) also implies


$\lambda_1 c_{\max}\big(S_1^{(i)}(S_2^{(i)})^{-1}\big) \ge \text{all } c\big(\Sigma_1^{(i)}(\Sigma_2^{(i)})^{-1}\big) \ge \lambda_2 c_{\min}\big(S_1^{(i)}(S_2^{(i)})^{-1}\big)$, (2-7)


where $S_1^{(i)}$, $S_2^{(i)}$, $\Sigma_1^{(i)}$ and $\Sigma_2^{(i)}$ have the same meaning as in the previous section. Likewise, as in
the previous section, we note that (2-3) also implies

$\lambda_1 c_{\max}\big(S_1^{(i,j)}(S_2^{(i,j)})^{-1}\big) \ge \text{all } c\big(\Sigma_1^{(i,j)}(\Sigma_2^{(i,j)})^{-1}\big) \ge \lambda_2 c_{\min}\big(S_1^{(i,j)}(S_2^{(i,j)})^{-1}\big)$, (2-8)


and so on till we reach the stage where any (p − 1) variates have been cut out, i.e. any one
variate has been retained, which gives us just the confidence bounds on variance ratios in
the univariate case. We have thus, with a joint confidence coefficient $\ge 1 - \alpha$, a confidence
statement (2-6), p confidence statements like (2-7), $\binom{p}{2}$ confidence statements like (2-8), and
so on. This again, from a certain standpoint, provides part of the analysis of a problem which
occurs in the multivariate generalization of the customary variance components analysis in
univariate analysis of variance and covariance.


3. CONFIDENCE BOUNDS ON ROOTS CONNECTED WITH THE REGRESSION MATRIX β(p × q)
OF A p SET ON A q SET IN A (p+q)-VARIATE NORMAL DISTRIBUTION 


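Let the population be denoted
$$N\left[\begin{pmatrix}\xi_1\\ \xi_2\end{pmatrix},\ \begin{pmatrix}\Sigma_{11} & \Sigma_{12}\\ \Sigma_{12}' & \Sigma_{22}\end{pmatrix}\right],$$
so that the p set has mean vector $\xi_1$ and the dispersion matrix $\Sigma_{11}$, and the q set has mean
vector $\xi_2$ and dispersion matrix $\Sigma_{22}$. $\Sigma_{12}$ is the matrix of covariances between the p set and
the q set and we define $\beta(p \times q) = \Sigma_{12}\Sigma_{22}^{-1}$. Let the sample dispersion matrix based on
a sample of size $(n+1)$ be written
$$S = \begin{pmatrix} S_{11} & S_{12}\\ S_{12}' & S_{22}\end{pmatrix}$$
and let $B(p \times q) = S_{12}S_{22}^{-1}$ be the sample
regression matrix of the p set on the q set. Also, let us write
$$S_{1\cdot2}(p \times p) = S_{11} - S_{12}S_{22}^{-1}S_{12}'.$$
Setting $\lambda^2 = \theta_\alpha/(1-\theta_\alpha)$ we can now rewrite (4-5) of Roy (1954a) as
$$\text{all } c[(B-\beta)(B'-\beta')] \le \lambda^2 c_{\max}(S_{1\cdot2})\, c_{\max}(S_{22}^{-1}) \tag{3-1}$$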


with a confidence coefficient >1—a. We recall the Lemmas C''and £! of Roy (1954a), 
viz. that (i) the statement '$g_1 \le$ all $c(M) \le g_2$ (for a $p \times p$ real matrix $M$ with real roots)' is
equivalent to the statement '$g_1 \le d'(1 \times p)\,M(p \times p)\,d(p \times 1) \le g_2$ (for all arbitrary unit
vectors $d$)'; and that (ii) the statement '$x'(1 \times q)\,x(q \times 1) \le h\ (>0)$' is equivalent to the
statement '$x'(1 \times q)\,d(q \times 1) \le h^{1/2}$ (for all arbitrary unit vectors $d$)'. Using these two lemmas,
we obtain from (3-1) the equivalent set of confidence statements,
$$d_1'Bd_2 - \lambda c_{\max}^{1/2}(S_{1\cdot2})\,c_{\max}^{1/2}(S_{22}^{-1}) \le d_1'\beta d_2 \le d_1'Bd_2 + \lambda c_{\max}^{1/2}(S_{1\cdot2})\,c_{\max}^{1/2}(S_{22}^{-1}), \tag{3-2}$$
for all unit vectors $d_1(p \times 1)$ and $d_2(q \times 1)$, with a confidence coefficient $\ge 1-\alpha$. Going back
to the above lemmas again we notice that, with respect to variation over $d_1$ and $d_2$, the
maximum values of $d_1'Bd_2$ and $d_1'\beta d_2$ are respectively $c_{\max}^{1/2}(BB')$ and $c_{\max}^{1/2}(\beta\beta')$. Now, first
choosing $d_1$ and $d_2$ so as to maximize $d_1'Bd_2$ and then choosing $d_1$ and $d_2$ so as to maximize
$d_1'\beta d_2$ and arguing in the same way as in the previous sections we note that (3-2) implies
$$c_{\max}^{1/2}(BB') - \lambda c_{\max}^{1/2}(S_{1\cdot2})\,c_{\max}^{1/2}(S_{22}^{-1}) \le c_{\max}^{1/2}(\beta\beta') \le c_{\max}^{1/2}(BB') + \lambda c_{\max}^{1/2}(S_{1\cdot2})\,c_{\max}^{1/2}(S_{22}^{-1}), \tag{3-3}$$


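which, therefore, is a confidence statement with a confidence coefficient $\ge 1-\alpha$. We now
rewrite (4-4) of Roy (1954a) in the equivalent form
$$\frac{a'(B-\beta)\,S_{22}\,(B'-\beta')\,a}{a'S_{1\cdot2}\,a} \le \lambda^2 \tag{3-4}$$
for all non-null $a(p \times 1)$'s, which is a confidence statement with a confidence coefficient $1-\alpha$.
This means that (3-4), with a probability $1-\alpha$, implies (3-3), with a probability $\ge 1-\alpha$. As
in the previous sections, take $a(p \times 1)$ such that the $i$th component is zero, define $S_{1\cdot2}^{(i)}$, $B^{(i)}$ and
$\beta^{(i)}$ as the 'truncated' matrices obtained by cutting out the $i$th variate of the p set, and
observe that (3-4) also implies
$$c_{\max}^{1/2}(B^{(i)}B^{(i)\prime}) - \lambda c_{\max}^{1/2}(S_{1\cdot2}^{(i)})\,c_{\max}^{1/2}(S_{22}^{-1}) \le c_{\max}^{1/2}(\beta^{(i)}\beta^{(i)\prime}) \le c_{\max}^{1/2}(B^{(i)}B^{(i)\prime}) + \lambda c_{\max}^{1/2}(S_{1\cdot2}^{(i)})\,c_{\max}^{1/2}(S_{22}^{-1}). \tag{3-5}$$
Likewise, as in the previous sections, we observe that (3-4) also implies
$$c_{\max}^{1/2}(B^{(i,j)}B^{(i,j)\prime}) - \lambda c_{\max}^{1/2}(S_{1\cdot2}^{(i,j)})\,c_{\max}^{1/2}(S_{22}^{-1}) \le c_{\max}^{1/2}(\beta^{(i,j)}\beta^{(i,j)\prime}) \le c_{\max}^{1/2}(B^{(i,j)}B^{(i,j)\prime}) + \lambda c_{\max}^{1/2}(S_{1\cdot2}^{(i,j)})\,c_{\max}^{1/2}(S_{22}^{-1}), \tag{3-6}$$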


and so on. We have thus, with a joint confidence coefficient $\ge 1-\alpha$, the statement (3-3),
$p$ statements like (3-5), $\binom{p}{2}$ statements like (3-6) and so on. This kind of result can be
generalized by truncating the variates of the q set as well, but this will not be discussed here.


4. CONFIDENCE BOUNDS ON ROOTS CONNECTED WITH
MULTIVARIATE LINEAR HYPOTHESIS ON MEANS 


4a. On $\xi$ of $N(\xi, \Sigma)$. We have a random sample of size $(n+1)$ and the sample mean vector is
written as $\bar{x}$ while the sample dispersion matrix is $S$. Setting $\lambda^2 = T_\alpha^2/(n+1)$, where $T_\alpha^2$ is the
upper $\alpha$% point of Hotelling's $T^2$ distribution with $p$ and $n+1-p$ degrees of freedom, we
can rewrite (4-1-4) of Roy & Bose (1953b) as
$$\frac{a'\bar{x}}{(a'a)^{1/2}} - \lambda\frac{(a'Sa)^{1/2}}{(a'a)^{1/2}} \le \frac{a'\xi}{(a'a)^{1/2}} \le \frac{a'\bar{x}}{(a'a)^{1/2}} + \lambda\frac{(a'Sa)^{1/2}}{(a'a)^{1/2}} \tag{4-1}$$
for all non-null $a(p \times 1)$'s. We recall that (4-1) is a confidence statement with a confidence
coefficient $1-\alpha$. We recall also that the maximum values of $a'\bar{x}/(a'a)^{1/2}$, $a'\xi/(a'a)^{1/2}$ and
$(a'Sa)^{1/2}/(a'a)^{1/2}$, with respect to variation over $a$'s, are respectively $(\bar{x}'\bar{x})^{1/2}$, $(\xi'\xi)^{1/2}$ and $c_{\max}^{1/2}(S)$;
we then reason in the same way as in the previous sections and deduce that (4-1) implies
$$[\bar{x}'\bar{x}]^{1/2} - \lambda c_{\max}^{1/2}(S) \le [\xi'\xi]^{1/2} \le [\bar{x}'\bar{x}]^{1/2} + \lambda c_{\max}^{1/2}(S), \tag{4-2}$$
which is thus a confidence statement with a confidence coefficient $\ge 1-\alpha$. Arguing as in
previous sections and using the same notation as before for 'truncated' $\bar{x}$, $\xi$ and $S$ obtained
by cutting out the $i$th variate, the $i$th and $j$th variates ($i \ne j$), and so on, we have with a joint
confidence coefficient $\ge 1-\alpha$, in addition to (4-2), $p$ statements like
$$[\bar{x}^{(i)\prime}\bar{x}^{(i)}]^{1/2} - \lambda c_{\max}^{1/2}(S^{(i)}) \le [\xi^{(i)\prime}\xi^{(i)}]^{1/2} \le [\bar{x}^{(i)\prime}\bar{x}^{(i)}]^{1/2} + \lambda c_{\max}^{1/2}(S^{(i)}), \tag{4-3}$$
$\binom{p}{2}$ statements like
$$[\bar{x}^{(i,j)\prime}\bar{x}^{(i,j)}]^{1/2} - \lambda c_{\max}^{1/2}(S^{(i,j)}) \le [\xi^{(i,j)\prime}\xi^{(i,j)}]^{1/2} \le [\bar{x}^{(i,j)\prime}\bar{x}^{(i,j)}]^{1/2} + \lambda c_{\max}^{1/2}(S^{(i,j)}), \tag{4-4}$$
and so on down to the stage of cutting out any $(p-1)$ variates, i.e. retaining any one variate.


4b. Some observations on multivariate linear hypothesis on means. Confidence bounds 
connected with univariate and multivariate linear hypothesis on means are discussed 
respectively in Chapters 15 and 16 of Roy (1954b). In this section we shall first set up
a physically more general hypothesis and then discuss the associated confidence bounds. 

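Let $X(n \times p)$ consist of $n$ row vectors $x_i(1 \times p)$ (with $i = 1, 2, \ldots, n$) which are independently
distributed, such that $x_i$ is $N[E(x_i), \Sigma]$ and let $E(X)(n \times p) = A(n \times m)\,\xi(m \times p)$, where $m < n$
and rank $(A) = r \le m$. Let $A_1(n \times r)$ be a basis of $A$ and let us write (as we can, without any
loss of generality) $A(n \times m) = [A_1\ A_2]$, with $A_1$ of order $n \times r$ and $A_2$ of order $n \times (m-r)$, and let us rewrite the expectation condition as
$$E(X) = [A_1\ A_2]\begin{pmatrix}\xi_1\\ \xi_2\end{pmatrix}, \tag{4-5}$$
where $\xi_1$ is of order $r \times p$ and $\xi_2$ of order $(m-r) \times p$.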


Here $X$ is a set of (observable) stochastic variates, $\xi$ is a set of unknown population
parameters, and $A$ is a known matrix of constants given by the design of the experiment and is
called the design matrix. It might consist of numbers like, say, 0, 1, etc. and/or a set of
observed (non-stochastic) quantities, as in the case of regression problems with concomitant
variates. The population dispersion matrix $\Sigma$ is also unknown. This is the model under which
we propose to test the hypothesis
$$H_0:\ \begin{pmatrix} C_{11} & C_{12}\\ C_{21} & C_{22}\end{pmatrix}\begin{pmatrix}\xi_1\\ \xi_2\end{pmatrix} M = 0, \tag{4-6}$$
where $C(q \times m)$, partitioned as above into $r$ and $m-r$ columns, and $M(p \times u)$ are matrices given by the hypothesis to
be tested and are called the hypothesis matrices. The 0 on the right side of (4-6) stands for
a $q \times u$ matrix whose elements are all equal to zero. It is assumed that rank $(M) = u \le p$ and
rank $(C) = s \le r$ ($\le m < n$ of course), and furthermore that, row-wise, $[C_{11}\ C_{12}]$ is a basis of
$C$ and, columnwise, $\begin{pmatrix}C_{11}\\ C_{21}\end{pmatrix}$ is also a basis of $C$. We recall that for $H_0$ to be 'testable' (i.e. the
set of bilinear functions $C\xi M$ of the unknown parameters $\xi$ are such that there exist unbiased
estimates, which are bilinear functions of the observations, for each of them) we should have
$$C_{i2} = C_{i1}(A_1'A_1)^{-1}A_1'A_2 \quad (\text{with } i = 1, 2). \tag{4-7}$$
However, in most realistic problems, the $C$ matrix of the hypothesis is given in a form such
that the last rows are absent and we can, therefore, without any essential loss of generality,
replace (4-6) by
$$H_0:\ [C_1\ C_2]\begin{pmatrix}\xi_1\\ \xi_2\end{pmatrix} M = 0, \tag{4-8}$$
where $C_1$ is of order $s \times r$ and $C_2$ of order $s \times (m-r)$, and (4-7) by
$$C_2 = C_1(A_1'A_1)^{-1}A_1'A_2. \tag{4-9}$$


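We now go back to $X$ and observe that $X(n \times p)\,M(p \times u)$ [$= X^*(n \times u)$, say] consists of
$n$ rows of independently distributed vectors $x_i(1 \times p)\,M(p \times u)$ [$= x_i^*(1 \times u)$, say] such that
$x_i^*$ is $N[E(x_i^*), M'\Sigma M]$, i.e. $N[E(x_i^*), \Sigma^*]$, say, and that
$$E(X^*) = [A_1\ A_2]\begin{pmatrix}\xi_1\\ \xi_2\end{pmatrix}M = [A_1\ A_2]\begin{pmatrix}\xi_1^*\\ \xi_2^*\end{pmatrix}, \tag{4-10}$$
say, where
$$\begin{pmatrix}\xi_1\\ \xi_2\end{pmatrix}M = \begin{pmatrix}\xi_1^*\\ \xi_2^*\end{pmatrix}, \tag{4-11}$$
with $\xi_1^*$ of order $r \times u$ and $\xi_2^*$ of order $(m-r) \times u$.
The $H_0$ of (4-8) can now be rewritten as
$$H_0:\ [C_1\ C_2]\begin{pmatrix}\xi_1^*\\ \xi_2^*\end{pmatrix} = 0, \tag{4-12}$$
and the alternative $H$ to $H_0$ can be expressed as
$$H:\ [C_1\ C_2]\begin{pmatrix}\xi_1^*\\ \xi_2^*\end{pmatrix} = \eta^*(s \times u). \tag{4-13}$$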

We next recall (16-6-3) of Roy (1954b), viz.,
a’X'A,(A‘ A,)—“10}, 0b — (a’Sa)? [sc,(p,8,n—r)]# <a'y’Ob 
<a’ X’A,(A‘A,)—1C,, Ob + (a’Sa)} [sc,(p, 8,n—r)]}#, (4:14) 
for all non-null a(p x 1)’s and all unit vectors b(s x 1), and substitute M’X’,i.e. X*’ for X’,C, 
for C,,, a*(u x 1) for a(p x 1), u for p and 9* for y. We then see that the confidence statement 
(4-14) is replaced by 
a*’X*’A,(A{A,)C; Ob — (a*’S*a*)t [sc,(u, 8, n—r)]* 
< a7" Ob<a®X*A,(Aj Ay)10; Ob + (a®'S*a*)} [s0,(u,8,n—r)]}t, (415) 


for all non-null $a^*(u \times 1)$ and all unit vectors $b(s \times 1)$, where $\eta^*$ is given by (4-13), $\xi^*$ by
(4-11), $X^*$ stands for $XM$, and where
O00’ = [(0,(A}A,)7204]7 (4-16) 
and 
$$(n-r)\,S^*(u \times u) = M'(u \times p)\,X'(p \times n)\,[I(n) - A_1(A_1'A_1)^{-1}A_1']\,X(n \times p)\,M(p \times u). \tag{4-17}$$
We note that (4-15) is a set of confidence statements, with a joint confidence coefficient
$1-\alpha$, on bilinear compounds of $\eta^*$, where $\eta^*$, defined in (4-13), may be regarded as measuring
the deviation from the null hypothesis $H_0$.

4c. Further consequences of (4-15). Starting from (4-15) and arguing in the same manner as
in §3 and setting $c_\alpha(u, s, n-r) = c_\alpha$ (say), we note that (4-15) implies


ChaxlX*’A,(A{A,)C; 00'C,(Aj Ay) 1A, X*] — [80,]# cha (S*) < chasiy* (0 0’) 9*] 
< chax[X*’A,(AjA,)1C,00'C,(Aj Ay) 14; X*] + [80,] hax(S*), (4°18) 


or substituting for 00’ from (4-16), 


$$c_{\max}^{1/2}(sS^{**}) - [sc_\alpha]^{1/2}c_{\max}^{1/2}(S^*) \le c_{\max}^{1/2}\big[\eta^{*\prime}[C_1(A_1'A_1)^{-1}C_1']^{-1}\eta^*\big] \le c_{\max}^{1/2}(sS^{**}) + [sc_\alpha]^{1/2}c_{\max}^{1/2}(S^*), \tag{4-19}$$
where the matrix due to the hypothesis, i.e. $sS^{**}$, is given by
$$sS^{**} = M'X'A_1(A_1'A_1)^{-1}C_1'[C_1(A_1'A_1)^{-1}C_1']^{-1}C_1(A_1'A_1)^{-1}A_1'XM, \tag{4-20}$$
and the matrix due to the error, i.e. $(n-r)S^*$, is given by (4-17). Notice that (4-19) is a confidence
statement with a confidence coefficient $\ge 1-\alpha$ and that the middle term of (4-19) is
zero if, and only if, the hypothesis $H_0$ is true. For $p = 1$, $M(p \times u)$ will drop out (except for
a trivial scalar factor, since $u \le p$) and we shall have the univariate problem where $c_{\max}(sS^{**})$
will be replaced by just the sum of squares due to the hypothesis, $c_{\max}(S^*)$ by just the error
mean sum of squares and
$c_{\max}\big[\eta^{*\prime}[C_1(A_1'A_1)^{-1}C_1']^{-1}\eta^*\big]$ by just the scalar $\eta^{*\prime}[C_1(A_1'A_1)^{-1}C_1']^{-1}\eta^*$.


Starting from (4-15) and reasoning in exactly the same way as in §§3 and 4a we see that
(4-15) also implies, in addition to (4-19), $u$ statements like (4-19) involving 'truncated'
$S^{*(i)}$, $S^{**(i)}$ and $\eta^{*(i)}$ obtained by cutting out any $i$th variate, $\binom{u}{2}$ statements like (4-19)
involving 'truncated' $S^{*(i,j)}$, $S^{**(i,j)}$ and $\eta^{*(i,j)}$ obtained by cutting out any pair of $i$th and
$j$th variates (with $i \ne j$), and so on. These latter confidence statements will thus have a joint
confidence coefficient $\ge 1-\alpha$.

It may be noted that the problem discussed in § 4a is really a special case of the one dis- 
cussed in §4c; nevertheless, for expository purposes, it is worthwhile to discuss first a simple 
problem like the one in §4a and then take up the most general one in § 4c. 


5. EXAMPLES TO ILLUSTRATE THE TECHNIQUES OF §§3 AND 4 


5-1. To illustrate §4 or, more properly speaking, the method developed in §4c for the
general problem of multivariate analysis of variance of means, we use the data from a
numerical example discussed in a standard textbook (Rao, 1952). The data consist of three
kinds of physical measurements on 140 school children, in more or less the same age group,
from six different high schools. To translate into the set-up of this paper, let $\xi_i(3 \times 1)$, for
$i = 1, 2, \ldots, 6$, denote the population mean vectors and let
$$\xi(6 \times 3) = \begin{pmatrix}\xi_1'\\ \vdots\\ \xi_6'\end{pmatrix}.$$
Then, if $X(n \times 3)$, where $n$ = total sample size = 140, be the observation matrix, we can
write
$$E(X) = A(n \times 6)\,\xi(6 \times 3), \tag{5-1-1}$$
where $A(n \times 6)$ is the matrix whose $j$th block of $n_j$ rows ($j = 1, \ldots, 6$; $n_1 + \ldots + n_6 = 140$) has unity in the $j$th column and zeros elsewhere,
and $r(A) = 6$.
Also $H_0$: $\xi_1 = \ldots = \xi_6 \Leftrightarrow C\xi = 0(5 \times 3)$, where
$$C(5 \times 6) = \begin{pmatrix} 1 & 0 & \cdots & 0 & -1\\ 0 & 1 & \cdots & 0 & -1\\ \vdots & \vdots & & \vdots & \vdots\\ 0 & 0 & \cdots & 1 & -1\end{pmatrix},$$
so that $r(C) = 5$. Furthermore $H$: $C\xi = \eta(5 \times 3)$. Hence we have $p = 3$, $n = 140$, $r = 6$, $s = 5$.
Next we take over from Rao's book the 'between' product moment and 'within' product
moment matrices given respectively by
$$sS^{**}(3 \times 3) = \begin{pmatrix} 752.0 & 214.2 & 521.3\\ & 151.3 & 401.2\\ & & 1612.7\end{pmatrix}$$
and
$$(n-r)\,S^*(3 \times 3) = \begin{pmatrix} 12809.3 & 1003.7 & 2671.2\\ & 1499.6 & 4123.6\\ & & 21009.6\end{pmatrix}.$$
According to the set-up of this paper we are interested in obtaining, with a joint probability
or confidence coefficient $\ge 0.95$, say, simultaneous confidence bounds on (i) a parametric
function which is a measure of departure from $H_0$ (on all three variates), (ii) three parametric
functions which are measures of departure from $H_0$ (on any two variates), and (iii) three
parametric functions which are measures of departure from $H_0$ (on any one variate). Going
back to the middle term of the inequality (4-19), we observe that here both $A$ and $C$ are of the
full permissible rank, and calculate out $[C(A'A)^{-1}C']^{-1}$ and obtain the following parametric
functions (on which we shall put confidence bounds):

For (i)
$$c_{\max}^{1/2}\Big[\sum_{j=1}^{6} n_j(\xi_{jk} - \bar{\xi}_k)(\xi_{jk'} - \bar{\xi}_{k'})\Big] = \Theta \text{ (say)},$$
where $\bar{\xi}_k = \sum_{j=1}^{6} n_j\xi_{jk}/n$ $(k = 1, 2, 3)$ and $\big[\sum_{j=1}^{6} n_j(\xi_{jk} - \bar{\xi}_k)(\xi_{jk'} - \bar{\xi}_{k'})\big]$
$(k, k' = 1, 2, 3)$ stands for a matrix whose $k, k'$ element is the expression within [ ]. Likewise
for (ii)
$$c_{\max}^{1/2}\Big[\sum_{j=1}^{6} n_j(\xi_{jk} - \bar{\xi}_k)(\xi_{jk'} - \bar{\xi}_{k'})\Big]$$
$= \Theta^{(1)}$ (say) when $k, k' = 2, 3$,
$= \Theta^{(2)}$ (say) when $k, k' = 1, 3$,
and $= \Theta^{(3)}$ (say) when $k, k' = 1, 2$.
Finally, for (iii)
$$c_{\max}^{1/2}\Big[\sum_{j=1}^{6} n_j(\xi_{jk} - \bar{\xi}_k)^2\Big]$$
$= \Theta^{(2,3)}$ (say) when $k = 1$,
$= \Theta^{(1,3)}$ (say) when $k = 2$,
and $= \Theta^{(1,2)}$ (say) when $k = 3$.

Notice that for (i) we have to deal with a 3 × 3 matrix, for (ii), three 2 × 2 matrices, and for
(iii), three scalars for which $c_{\max}[\ ] = [\ ]$ itself. Notice also that where the null hypotheses
are not true, then on the six groups, $\Theta$ is the positive square root of the largest characteristic
root of the population 'between' product moment matrix for all the three variates, next
$\Theta^{(1)}$, $\Theta^{(2)}$ and $\Theta^{(3)}$ are respectively the positive square roots of the largest characteristic roots of the
population 'between' product moment matrices for the variates (2, 3), (1, 3) and (1, 2), and
finally $\Theta^{(2,3)}$, $\Theta^{(1,3)}$ and $\Theta^{(1,2)}$ are respectively the positive square roots of the population
'between' sum of squares for the variates 1, 2 and 3.

Table 1-1

(0)                    (1)                (2)                  (3)                  (4)
                   (a)     (b)       (a)       (b)        (a)       (b)        (a)       (b)
Matrix M          sS**     S*      sS**(1)    S*(1)     sS**(2)    S*(2)     sS**(3)    S*(3)
c_max^{1/2}(M)    44.40   12.93     41.42     12.77      43.11     12.78      28.65      9.81

(0)                      (5)                      (6)                      (7)
                    (a)         (b)          (a)         (b)          (a)         (b)
Matrix M         sS**(1,2)    S*(1,2)     sS**(1,3)    S*(1,3)     sS**(2,3)    S*(2,3)
c_max^{1/2}(M)     40.16       12.52        12.30        3.35        27.42        9.78

Up to this point the data and the analysis
are entirely realistic. To set up the confidence bounds (4-19) and similar ones obtained by
truncating the variates by ones, next by twos, we need $c_\alpha$, or, in other words, the 5% points
of the relevant statistic with D.F. $p = 3$, $s = 5$ and $n - r = 134$. The construction of such
tables is under way and the percentage points are not available to us at the moment. From
certain inequality relations discussed in (Roy, 1953a) we set very approximately
$$c_\alpha(p, s, n-r) = c_{0.05}(3, 5, 134) = 2.90, \text{ so that } sc_\alpha = 14.5 \text{ and } [sc_\alpha]^{1/2} = 3.81.$$


In Table 1-2, the 1st column gives confidence bounds on $\Theta$, the 2nd, 3rd and 4th on $\Theta^{(1)}$,
$\Theta^{(2)}$ and $\Theta^{(3)}$, respectively, and finally the 5th, 6th, and 7th on $\Theta^{(1,2)}$, $\Theta^{(1,3)}$ and $\Theta^{(2,3)}$, respectively.
The $\Theta$'s being intrinsically non-negative, negative lower bounds would merely
imply that the corresponding $\Theta$'s could come down to zero values. The joint probability or
confidence coefficient is $\ge 0.95$. As stated just now, the analysis and the conclusions are
rendered somewhat approximate for lack of tables of percentage points, which, if available,
would have made the confidence statements entirely correct.
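
The arithmetic behind these bounds can be sketched as follows. This is a minimal illustration, assuming the rounded values of $c_{\max}^{1/2}(M)$ printed in Table 1-1 and the rough percentage point 2.90 quoted above; the names `TABLE_1_1` and `confidence_bounds` are ours and do not come from the paper.

```python
import math

# (a) = c_max^{1/2}(sS**) and (b) = c_max^{1/2}(S*), as printed in Table 1-1,
# keyed by the variates cut out. These are rounded, tabulated values.
TABLE_1_1 = {
    "none": (44.40, 12.93),
    "1st": (41.42, 12.77),
    "2nd": (43.11, 12.78),
    "3rd": (28.65, 9.81),
    "1st and 2nd": (40.16, 12.52),
    "1st and 3rd": (12.30, 3.35),
    "2nd and 3rd": (27.42, 9.78),
}


def confidence_bounds(a, b, multiplier):
    """Return (lower, upper) = (a - multiplier * b, a + multiplier * b), as in (4-19)."""
    half_width = multiplier * b
    return a - half_width, a + half_width


s, c_alpha = 5, 2.90                            # rough percentage point from the text
multiplier = round(math.sqrt(s * c_alpha), 2)   # [s c_alpha]^{1/2}, rounded as in the text

for cut, (a, b) in TABLE_1_1.items():
    lower, upper = confidence_bounds(a, b, multiplier)
    print(f"{cut:12s} {lower:8.2f} {upper:8.2f}")
```

Because the multiplier rests on an approximate percentage point, the printed bounds inherit that approximation.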

It may be asked at this stage, what happens if we use Table 1-2 for testing of hypothesis.
We observe that the procedure on testing of hypothesis, while related to, is not entirely
identical with the procedure on confidence bounds. As discussed in (Roy, 1953) there is
a procedure (which would be the right one) for testing $H_0$ with an exact preassigned probability,
and, at the 5% level, $H_0$ would be just rejected on that level, but would not be
rejected on a level, say, a little less than 5%. This agrees broadly with the conclusion given
in (Rao, 1952). However, if we use the above Table 1-2 on confidence bounds, for testing of
hypothesis at a level $\le 5$% (and recall the approximation introduced by taking a rough value
of the percentage point), then, since all the confidence intervals include zero (which would
correspond to the null hypothesis), we would say that we should just about not reject the
null hypothesis on all 3 variates and on variates (1, 2), (1, 3) and (2, 3) and also on variates (1),
(2) and (3).

Table 1-2

Variates cut out                              None     1st     2nd     3rd    1st and 2nd   1st and 3rd   2nd and 3rd
Lower bounds [= (a) − [sc_α]^{1/2}(b)]       −4.86   −7.23   −5.58   −8.73      −7.54         −0.46         −9.84
Upper bounds [= (a) + [sc_α]^{1/2}(b)]       93.66   90.07   91.80   66.03      87.86         25.06         64.68


5-2. To illustrate §3 we construct an artificial example by taking over $S^{**}$ and $S^*$ from the
previous example and putting
$$p = q = 3,\quad n = 8\ (\text{sample size} - 1),\quad B(3 \times 3)\,B'(3 \times 3) = S^{**}(3 \times 3),\quad S_{1\cdot2}(3 \times 3) = S^*(3 \times 3)$$
and
$$S_{22}(3 \times 3) = \begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1\end{pmatrix} = I(3), \text{ (say)}.$$
Thus all $c(S_{22}^{-1}) = 1$. We have also, very approximately, $\theta_\alpha(p, q, n) = \theta_{0.05}(3, 3, 8) = 0.67$, so
that $\lambda^2 = 0.67/0.33 = 2.03$. Denoting by $\beta(3 \times 3)$ the unknown population regression matrix
of the p set on the q set, by $\beta^{(1)}(2 \times 3)$, $\beta^{(2)}(2 \times 3)$ and $\beta^{(3)}(2 \times 3)$, respectively, the unknown
population regression matrices of (2, 3), (1, 3) and (1, 2) on the q set and by $\beta^{(2,3)}(1 \times 3)$,
$\beta^{(1,3)}(1 \times 3)$ and $\beta^{(1,2)}(1 \times 3)$, respectively, the unknown population regression vectors of
variates (1), (2) and (3) on the q set, we are interested in obtaining, with a joint probability
$\ge 0.95$, confidence bounds on $c_{\max}^{1/2}(\beta\beta') = \Theta$ (say), on $c_{\max}^{1/2}(\beta^{(i)}\beta^{(i)\prime}) = \Theta^{(i)}$ (say, with $i = 1, 2, 3$)
and on $c_{\max}^{1/2}(\beta^{(i,j)}\beta^{(i,j)\prime}) = \Theta^{(i,j)}$ (say, with $i \ne j = 1, 2, 3$).

Each of these $\Theta$'s is a proper measure of departure from a corresponding null hypothesis.
We now have

Table 2-1

(0)                    (1)                  (2)                     (3)                     (4)
                   (a)      (b)        (a)         (b)         (a)         (b)         (a)         (b)
Matrix M           BB'     S_1.2    B(1)B(1)'   S_1.2(1)    B(2)B(2)'   S_1.2(2)    B(3)B(3)'   S_1.2(3)
c_max^{1/2}(M)    19.86    12.93      18.52       12.77       19.28       12.78       12.81        9.81

(0)                        (5)                          (6)                          (7)
                     (a)            (b)           (a)            (b)           (a)            (b)
Matrix M        B(1,2)B(1,2)'   S_1.2(1,2)   B(1,3)B(1,3)'   S_1.2(1,3)   B(2,3)B(2,3)'   S_1.2(2,3)
c_max^{1/2}(M)      17.96          12.52          5.50           3.35          12.26           9.78

Table 2-2

Variates cut out           None     1st     2nd     3rd    1st and 2nd   1st and 3rd   2nd and 3rd
Bounds
Lower [= (a) − λ(b)]       1.50    0.39    1.13   −1.12       0.18          0.74         −1.63
Upper [= (a) + λ(b)]      38.22   36.66   37.43   26.74      35.74         10.26         26.15











In Table 2-2, the 1st column gives confidence bounds on $\Theta$, the 2nd, 3rd and 4th, respectively,
on $\Theta^{(1)}$, $\Theta^{(2)}$ and $\Theta^{(3)}$, and the 5th, 6th and 7th respectively, on $\Theta^{(1,2)}$, $\Theta^{(1,3)}$ and $\Theta^{(2,3)}$, all
with a joint probability $\ge 0.95$. As before, the $\Theta$'s being intrinsically non-negative, any
negative lower bound would merely imply that the corresponding $\Theta$ could come down to
zero. The conclusion would be rough on account of the broad approximation in the percentage
point.

Turning now to the problem of testing of the null hypothesis we recall that it is possible to
do this at an exact preassigned level (Roy, 1953a). As observed on the previous example, we
note also that the procedure of testing of hypothesis, while related to, is not entirely
identical with that on confidence bounds. However, if we do use Table 2-2 for testing of
hypothesis, then we first recall all the approximations involved and also that, since
$c_{\max}(M) \ge c_{\max}(\text{truncated } M)$, we have now, necessarily,
$$\Theta \ge \Theta^{(1)}, \Theta^{(2)}, \Theta^{(3)};\quad \Theta^{(1)} \ge \Theta^{(1,2)}, \Theta^{(1,3)};\quad \Theta^{(2)} \ge \Theta^{(1,2)}, \Theta^{(2,3)}\ \text{ and }\ \Theta^{(3)} \ge \Theta^{(1,3)}, \Theta^{(2,3)}. \tag{5-2-1}$$
We note next that the only intervals that include zero are the columns 4 and 7. Thus we
should roughly conclude that (i) not all variates (1, 2, 3) are independent of the q set, (ii) none
of the variates (3) or (2) is independent of the q set, while variate (1) is independent of the
q set and (iii) not all variates of either (2, 3) or (1, 3) or (1, 2) are independent of the q set.
The conclusion (i) follows from the 1st column, (ii) from columns (5), (6) and (7), while (iii)
follows from columns (2), (3) and (4) and also the inequalities (5-2-1).
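
The reading of Table 2-2 used above can be checked mechanically. The sketch below assumes the rounded entries of Table 2-1 and takes $\lambda$ as the two-decimal square root of 0.67/0.33, which is what the tabulated bounds appear to use; `TABLE_2_1` and `columns_including_zero` are illustrative names of ours.

```python
import math

# (a) = c_max^{1/2}(BB') and (b) = c_max^{1/2}(S_1.2) for columns (1)-(7) of Table 2-1.
# Since S_22 = I(3), the factor c_max^{1/2}(S_22^{-1}) equals 1 and is omitted.
TABLE_2_1 = {
    1: (19.86, 12.93),
    2: (18.52, 12.77),
    3: (19.28, 12.78),
    4: (12.81, 9.81),
    5: (17.96, 12.52),
    6: (5.50, 3.35),
    7: (12.26, 9.78),
}

lam = round(math.sqrt(0.67 / 0.33), 2)  # lambda, from lambda^2 = theta/(1 - theta)


def columns_including_zero(table, lam):
    """List the columns whose interval [a - lam*b, a + lam*b] contains zero."""
    return [
        col
        for col, (a, b) in sorted(table.items())
        if a - lam * b <= 0.0 <= a + lam * b
    ]


print(columns_including_zero(TABLE_2_1, lam))
```

Only the intervals flagged by this check fail to exclude the corresponding null value directly; the remaining conclusions rest on the inequalities (5-2-1).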


Concluding remarks. We hope to be able to illustrate much better and with realistic 
examples and tables the methods of §§ 1, 2,3 and 4 when two-tailed tables needed for §§ 1 
and 2 and one-tailed tables needed for §§ 3 and 4 become available, as we expect them to be, 
in the near future. 


In conclusion, it is a great pleasure to thank the editor and the referees for their
valuable suggestions for the improvement of the paper both in form and in content.


TABLES FOR USE IN ESTIMATING THE NORMAL DISTRIBUTION
FUNCTION BY NORMIT ANALYSIS 


PART I. DESCRIPTION AND USE OF TABLES 


PART II. COMPARISON BETWEEN MINIMUM NORMIT χ² ESTIMATE
AND THE MAXIMUM LIKELIHOOD ESTIMATE


By JOSEPH BERKSON 
Section of Biometry and Medical Statistics, Mayo Clinic, Rochester, Minnesota 


Part I. DESCRIPTION AND USE OF TABLES 


The cumulative normal distribution function has been used extensively in bio-assay and in
other experiments in which $P$, the probability of some all-or-none 'response', is a monotonic
function, increasing or decreasing, of a quantity $x$ which measures the potency of the agent
producing the response. The function may be written as in (1), with some auxiliary quantities
defined by (2) and (3)
$$P_i = 1 - Q_i = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{X_i} e^{-u^2/2}\,du, \tag{1}$$
$$Z_i = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-(x_i-\mu)^2/2\sigma^2}, \tag{2}$$
$$\text{Normit of } P_i = X_i = (x_i - \mu)/\sigma = \alpha + \beta x_i. \tag{3}$$
$P_i$, the probability of response at $x_i$, is given by (1) in terms of the integral of the standardized
normal function $N(0, 1)$. The normal frequency curve, $N(\mu, \sigma)$, with ordinate $Z_i$ given by (2) is
sometimes taken to represent the distribution of hypothetical resistances of the individuals
in the experimental population.* From (3), $\alpha = -\mu/\sigma$, $\beta = 1/\sigma$. In bio-assay problems, one
is frequently interested in the value of the dosage $x$ corresponding to a 50% response; this is
given by $x_{50} = -\alpha/\beta$.†

Assuming this model, and given a set of observations $x_i, n_i, r_i$, where $n_i$ is the number of
individuals 'exposed' at $x_i$, and $r_i$ is the number responding out of the $n_i$, a method of estimating
the parameters, which is part of a procedure named 'normit analysis', has been
investigated by Berkson (1955a). The estimator provided by this method falls in the class
R.B.A.N. of Neyman (1949) (regular best asymptotically normal) and therefore is asymptotically
equivalent to the maximum likelihood estimator. For finite samples, at least in
a wide variety of conditions corresponding to situations met in practice, the variance and
mean square error of this estimator are smaller than those of the maximum likelihood
estimator.‡

With $p_i = 1 - q_i = r_i/n_i$ representing the observed relative frequency of response at $x_i$, and
$X_i$ representing the normit corresponding to $p_i$ (observed normit or normit of $p_i$), if (1) gives


* This, however, is not a necessary part of the model. There are cases known in which such variable 
resistances may not even exist and still (1) may pretty well describe the observed phenomena. The 
model consists simply in the hypothesis that (1) evaluates the relevant probabilities, without implying 
necessarily any particular mechanism underlying them. 

† Not to be confused with $x_i$ ($i = 50$). ‡ See Part II.

the true probability of response at $x_i$, the observed $X_i$ plotted against $x_i$ should fall along the
straight line $X_i = \alpha + \beta x_i$, subject to random variation of the $X_i$. The estimate of normit
analysis is obtained by minimizing
$$\chi^2(\text{normit}) = \sum n_i w_i (X_i - \hat{X}_i)^2, \tag{4}$$
where $\hat{X}_i = \hat{\alpha} + \hat{\beta}x_i$ is the estimate of $\alpha + \beta x_i$, and $w_i = Z_i^2/(p_i q_i)$ with $Z_i$ given by (2) with $p_i$
replacing $P_i$. Since (4) is asymptotically distributed as $\chi^2$, it is called the 'normit $\chi^2$', and the
estimate of normit analysis is called the 'minimum normit $\chi^2$ estimate'*.

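The minimization of (4) is obtained by accomplishing a weighted least squares fit of
a straight line to the observations $X_i$, the weight of $X_i$ being $n_i w_i$. The estimates of $\alpha$ and $\beta$,
symbolized $\hat{\alpha}$ and $\hat{\beta}$, are explicit functions of the observations $x_i$, $X_i$, and are achieved simply
and directly without use of iterative procedures, from the following equations:
$$\hat{\beta} = \frac{\sum n_i w_i (X_i - \bar{X})(x_i - \bar{x})}{\sum n_i w_i (x_i - \bar{x})^2} = \frac{\sum n_i w_i X_i x_i - \sum n_i w_i X_i \sum n_i w_i x_i / \sum n_i w_i}{\sum n_i w_i x_i^2 - (\sum n_i w_i x_i)^2 / \sum n_i w_i}, \tag{5}$$
$$\hat{\alpha} = \bar{X} - \hat{\beta}\bar{x}, \tag{6}$$
$$\hat{x}_{50} = -\hat{\alpha}/\hat{\beta}, \tag{7}$$
where $\bar{X} = \sum n_i w_i X_i / \sum n_i w_i$, $\bar{x} = \sum n_i w_i x_i / \sum n_i w_i$.

For large samples the following formulas may be used to provide 'internal estimates' of
the standard errors of these estimates
$$s^2(\bar{X}) = 1/\sum n_i w_i, \tag{8}$$
$$s^2(\hat{\beta}) = 1/\sum n_i w_i (x_i - \bar{x})^2, \tag{9}$$
$$s^2(\hat{\alpha}) = s^2(\bar{X}) + \bar{x}^2 s^2(\hat{\beta}), \tag{10}$$
$$s^2(\hat{x}_{50}) = \frac{1}{\hat{\beta}^2}\big[s^2(\bar{X}) + (\hat{x}_{50} - \bar{x})^2 s^2(\hat{\beta})\big]. \tag{11}$$
When $\alpha$ only has to be estimated equation (6) gives this estimate on replacing $\hat{\beta}$ by its known
value $\beta$.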

The calculations require for each observation the unit weight w, and also w,X;. These are 
provided in Tables 1 and 2 (pp. 414-19 below). For each value of n between 2 and 50 inclusive, 
Table 1 gives the required quantities, with r as argument. Tabling in this way makes it 
unnecessary to compute p = r/n and also provides w and wX with greater precision than 
usually would be obtained from the calculated value of p. For n > 50 it is necessary first to 
calculate p, and to use Table 2, which gives w and wX with p as argument. For p = 0 and
p = 1, X is negatively and positively infinite and wX is zero. In such cases a working value
is used taking p as 1/(2n) and 1 − 1/(2n) for the case of p equal to 0 and 1, respectively.†

The values of w and wX were calculated using the Statistical Tables of Kelley (1938), 
carrying the eight decimal places of those tables in the arithmetic computations; they are 
given in Tables 1 and 2 to five decimal places with the last digit correct to + 1. The tables 
were checked by recomputation of sections of them in a spot check, and then by calculation 
and inspection of first and second differences. An example of the calculation of the estimates 
is shown in Table 3. 
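
As a rough cross-check on the tabulated quantities, the unit weight and its product with the normit can be recomputed directly. The sketch below is an assumption-laden illustration, not the procedure used to build the tables: it relies on Python's `statistics.NormalDist` instead of Kelley's tables, takes $Z$ as the standard normal ordinate at the observed normit, and the function name `normit_weight` is ours.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # N(0, 1)


def normit_weight(n, r):
    """Return (w, wX) for r responders out of n, with w = Z^2 / (p q)."""
    if r == 0:
        p = 1 / (2 * n)        # working value used when the observed p is 0
    elif r == n:
        p = 1 - 1 / (2 * n)    # working value used when the observed p is 1
    else:
        p = r / n
    x_normit = _STD_NORMAL.inv_cdf(p)   # observed normit X
    z = _STD_NORMAL.pdf(x_normit)       # standard normal ordinate at X
    w = z * z / (p * (1 - p))
    return w, w * x_normit


w, wX = normit_weight(8, 0)
print(f"{w:.5f} {wX:.5f}")
```

Values computed this way may differ from the printed tables in the last decimal place, in line with the accuracy stated above.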


* A generalization of this class of estimates has since been formulated as 'minimum transform χ²
estimates' (Berkson, 1954). † See Part II.

Table 3. Example of calculations; data from Fisher & Yates (1953)

  x    n    r       w*         wX*         wx
  3    8    0    0.25813    −0.39601    0.77439
  4    8    0    0.25813    −0.39601    1.03252
  5    8    2    0.53857    −0.36326    2.69285
  6    8    3    0.61350    −0.19549    3.68100
  7    8    3    0.61350    −0.19549    4.29450
  8    8    7    0.38743     0.44569    3.09944
  9    8    8    0.25813     0.39601    2.32317

Σw = 2.92739,  Σwx = 17.89787,  ΣwX = −0.70456,
x̄ = Σwx/Σw = 6.11393,  X̄ = ΣwX/Σw = −0.24068,

Σwx² = 117.76905                      ΣwXx = −0.00013
(Σwx)²/Σw = 109.42640                 ΣwX Σwx/Σw = −4.30763
Σw(x − x̄)² = Diff. = 8.34265         Σw(X − X̄)(x − x̄) = Diff. = 4.30750

$$\hat{\beta} = \frac{\sum w(X - \bar{X})(x - \bar{x})}{\sum w(x - \bar{x})^2} = 0.5163, \qquad \hat{\alpha} = \bar{X} - \hat{\beta}\bar{x} = -3.3973,$$
$$\hat{x}_{50} = -\hat{\alpha}/\hat{\beta} = 6.5801.$$
$$s^2(\bar{X}) = \frac{1}{n\sum w} = 0.04270 \quad (s(\bar{X}) = 0.21),$$
$$s^2(\hat{\beta}) = \frac{1}{n\sum w(x - \bar{x})^2} = 0.01498 \quad (s(\hat{\beta}) = 0.12),$$
$$s^2(\hat{\alpha}) = s^2(\bar{X}) + \bar{x}^2 s^2(\hat{\beta}) = 0.6027 \quad (s(\hat{\alpha}) = 0.78),$$
$$s^2(\hat{x}_{50}) = \frac{1}{\hat{\beta}^2}\big[s^2(\bar{X}) + (\hat{x}_{50} - \bar{x})^2 s^2(\hat{\beta})\big] = 0.1724 \quad (s(\hat{x}_{50}) = 0.42),$$
$$\hat{X} = 0.52x - 3.40.†$$

Fig. 1. The data of example for which calculations are shown in Table 3,
together with the line fitted by minimum normit χ².
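
The calculation laid out in Table 3 can be reproduced with a short routine implementing equations (5) to (11). This is a sketch under the assumption, true of the example, that every dose has the same group size `n`, so that `n` cancels from the estimates and enters only the variances; the function name `fit_normit_line` and the dictionary keys are ours.

```python
def fit_normit_line(x, w, wX, n=1):
    """Weighted least-squares normit fit from columns x, w and wX.

    n is the common group size; it only scales the variance estimates.
    With unequal group sizes, pass n_i*w_i and n_i*w_i*X_i as w and wX and leave n=1.
    """
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swX = sum(wX)
    x_bar = swx / sw
    X_bar = swX / sw
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x)) - swx ** 2 / sw
    sxX = sum(wXi * xi for wXi, xi in zip(wX, x)) - swX * swx / sw
    beta = sxX / sxx                       # equation (5)
    alpha = X_bar - beta * x_bar           # equation (6)
    x50 = -alpha / beta                    # equation (7)
    var_X_bar = 1.0 / (n * sw)             # equation (8)
    var_beta = 1.0 / (n * sxx)             # equation (9)
    var_alpha = var_X_bar + x_bar ** 2 * var_beta                     # equation (10)
    var_x50 = (var_X_bar + (x50 - x_bar) ** 2 * var_beta) / beta ** 2  # equation (11)
    return {
        "alpha": alpha, "beta": beta, "x50": x50,
        "var_X_bar": var_X_bar, "var_beta": var_beta,
        "var_alpha": var_alpha, "var_x50": var_x50,
    }


# Columns x, w and wX of Table 3 (n = 8 at every dose).
x = [3, 4, 5, 6, 7, 8, 9]
w = [0.25813, 0.25813, 0.53857, 0.61350, 0.61350, 0.38743, 0.25813]
wX = [-0.39601, -0.39601, -0.36326, -0.19549, -0.19549, 0.44569, 0.39601]

fit = fit_normit_line(x, w, wX, n=8)
print({name: round(value, 4) for name, value in fit.items()})
```

Small differences in the last decimal place from the values in Table 3 can arise because the table rounds intermediate quantities.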


* From Table 1. If the $n_i$ are not all equal as they are in the present example, these columns contain nw
and nwX instead of w and wX. In case $n_i > 50$, one must first compute $p_i = r_i/n_i$, and consult Table 2.
† In practical work, it is useful to plot the data and the estimated normit line. A number of graph
papers have been published for plotting linearly the cumulative normal distribution function. A convenient
one for practical work, in respect to size and scale, is graph paper no. 32451, Normal ruling, Codex
Book Company, Norwood, Mass., U.S.A. The present data are shown plotted on this graph paper in Fig. 1.











Table 1. Normit weights 
Upper figure in table is w = Z²/pq; lower figure is wX
For r < ½n, wX is negative. For r > ½n, use n − r as argument, and wX is positive

 r \ n      2         3         4         5         6         7         8         9        10
   0     0.53857   0.44951   0.38743   0.34222   0.30763   0.28031   0.25813   0.23974   0.22394
          .36326    .43480    .44569    .43858    .42551    .41078    .39601    .38187    .36834
   1      .63662    .59490    .53857    .48987    .44951    .41588    .38743    .36317    .34222
          0         .25630    .36326    .41228    .43480    .44390    .44569    .44332    .43858
   2        —         —       .63662    .62192    .59490    .56612    .53857    .51309    .48987
                              0         .15756    .25630    .32042    .36326    .39240    .41228
   3        —         —         —         —       .63662    .62917    .61350    .59490    .57567
                                                  0         .11321    .19549    .25630    .30188
   4        —         —         —         —         —         —       .63662    .63211    .62192
                                                                      0         .08838    .15756
   5        —         —         —         —         —         —         —         —      0.63662
                                                                                          0
n= . = [| ee ae e | . 
= a ie eee A ee Se ah Se A | 6 
~ u | . 13 | “u | 1 16 17 is | 19 20 . 
r 
~~ & : | | — 
0 | 0-21059 | 0-19881 | 0-18850 | 017916 | 0-17090 | 016845 | 0-15689 | 0-15093 | 0-14520 | 0-14014 8 
| -35592 | -34420 | -33335 | -32302 | -31348 | -30458 | -29648 | -28890 | -28143 | -27466 
| 
1 | -32386| -30763 | -29324 | -28031 | -26880 | -25813 | -24841 | -23974 | -23138 | -22394 | 9 
| | 48243 | -42551 | -41823  -41078 | -40342 | -39601 | -38875 | -38187 | -37487| -36834 
| 2 | -46870| -44951 | -43186 | -41588 | -40099 | -38743 | -37477| -36317| -35241 | -34222 | 10 
| | 49582 | -43480 | -44062 | -44390 | -44547| -44569 | -44488 | -44332 | -44119 | -43858 
3 | +55672 | -53857 | 52138 | -50513 | -48987 | -47555 | -46214 | -44951 | -43758 | -42638 il 
| -33663 | -36326 | -38385 | -39986 | 41228 | -42188 | -42023 43480 | -43895 | 44191 | 
4 | -60899 | -59490 | -58048 | -56612 | -55214 | -53857 | -52557 | -51309 | -50119 | -48987 12 
| «21245 | -25630 | -29162 | -32042 | -34389 | -36326 | -37920| -39240 | -40330 | -41228 | 
| | | | | | 
5 | 063360 | 062645 | 0-61697 | 0-60623 | 0-59490 | 0-58337 | 0-57182 | 0-56049 | 0-54940 | 0-53857 | 13 
07242 | -13177 | 18103 | -22202 | -25630 | -28514 | -30961 | -33034 | *34805 | -36326 | | 
6 — 63662 | -63446 | 62917 | 62192 | -61350 | -60437 | -59490 | -58530 | -57567 | | 14 
0 | -06132 | -11321 | -15756 | -19549 | -22815 | -25630 | -28064 | -30188 | 
| | | 
7 ~ a — | -63662 | -63501 | -63092 | -62521 | -61843 | -61094 | -60306 | 15 
| 0 | -05307 | -09925 | -13937 | -17451 | -20533 | -23237 | | 
si —-ji--lf- — — | -63662 | -63536 | -63211 | 62751 | -62192 | = 
| | | 0 04687 | -08838 | -12492 | -15756 | 
ej —- | — _ a a ee — | -ese62 | “63561 | -63208 | 
| | 0 | -04193 | -07954 | 
10} — — | — — — ‘en Miia | — | — | 063662 
| | 0
Table 1 (cont.)
Upper figure in table is w = Z²/pq; lower figure is wX
For r < ½n, wX is negative. For r > ½n, use n − r as argument, and wX is positive
. 2 | 22 23 24 25 | 26 27 28 29 30 
P 
PS naa tee ; i TSA: ee ml £ 5g 
0 | 0-13537 | 0-13091 | 0-12679 | 0-12301 | 0-11961 | 0-11615 | 0-11308 | 0-11042 | 0-10727 | 0-10499 
| +26815 | -26194 | -25609 | -25064 | -24564 | 24050 | -23587 | -23179 | -22692 | -22335 
1 | -21689 | -21059 | -20445 | -19881 | -19338 | -18850 | -18354 | -17916 | -17506 | -17090 
-36190 | -35592 | -34990 | -34420 | -33855 | -33335 | -32791 | -32302 | -31833 | -31348 
2 -33268 | -32386 | -31564 | -30763 | -30029| -29324| -28673 | -28031 | -27449| -26880 
| -43561 | -43243 | 42910 | -42551 | -42193 | -41821 | -41458 | -41078 | -40714 | -40342 
3 | -41588 | -40589 | -39633 | -38743 | -37894 | -37090 | .36317 | -35579 | -34880 | -34222 
-44390 | 44511 | -44567 | -44569 | -44525 | -44445 | 44332 | -44193 | 44032 | -43858 
4 -47907 | 46870 | -45885 | -44951 | -44048 | -43186 | -42361 | -41588 | -40822 | -40099 
-41969 | -42582 | -43098 | -43480 | -43804 | -44062 | -44251 | -44390 | 44488 | -44547 
| | 
5 | 0-52812 | 051805 | 0-50829 | 0-49887 | 0-48987 | 0-48116 | 0-47281 | 0-46476 | 0-45693 | 0-44951 
87625 *38738 | -39698 | -40525 | -41228 | -41834 | -42350 | -42791 | -43168 | -43480 
6 | -56612 | -55672| -54757 | -53857 | -52984 | -52138 | -51309 | -50513 | -49738 | -48987 
| -82042 | -33663 | -35076 | -36326 | -37422 | -38385  -39240 | -39986 | -40647 | -41228 
| | | | 
7 | +59490 | -58669 | 57838 | 57022 | -56208 | -55406 | -54628 | -53857 | -53109 | -52372 | 
| +25630 | -27735 | -29617 | -31272 | -32760 | -34088 | -35263 | -36326 | -37272 | -38128 | 
8 | -61570 | -60899 | 60204 | -59490 | ‘58771 | -58048 | -57327 | -56612 | -55909 | -55214 | 
‘18647 | -21245 | -23556 | 25630 | -27487 | -29162 | 30674 | “32042-33270 | -34389 | 
9 | -62917 | -62450  -61921 | -61350 | -60748 | -60129 | -59490 | -58850 | +58206 | ‘57567 | 
| °11821 | -14355 | -17086 | 19549 | -21775 | -23787 | -25630 | 27294 | -28812 | "30188 | 
10 | 0-63580 | 0-63360 | 0-63041 | 0-62645 | 0-62192 | 0-61697 | 0-61173 | 0-60623 | 0-60062 | 0-59490 | 
03795 | -07242 | -10349 | -13177 | -15756 | -18103 | 20236 | -22202 | -23989  -25630 | 
ny; — -63662 | -63593 | -63409 | -63137| -62797 | -62403| -61973 | -61509 | -61026 | 
| 0 | -03461 | -06640 | -09532  -12181 | -14617 | -16842  -18903 | -20786 | 
12 = = cond 63662 | -63604 | -63446 | -63211 | -62917 | -62573 | -62192 | 
| 0 | -08190 | -06132 | -08838 | -11321 | -13627 | -15756 | 
13 = vo = = = -63662 | -63612 | -63476 | -63272 | -63012 | 
| 0 | 02951 -05688 | -08223 | -10585 | 
14 = caruite | ieee = | a= = = 63662 | -63619 | -63501 
| | 0 | -02744 | -05307 | 
“a a oe | — — | —_ | 063662 | 
| e 0 
7 eo! ee | : . | 
Table 1 (cont.)

Upper figure in table is $w = Z^2/pq$; lower figure is $wX$
For $r < \tfrac{1}{2}n$, $wX$ is negative. For $r > \tfrac{1}{2}n$, use $n - r$ as argument, and $wX$ is positive

r \ n | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39





0-08981 | 0-08833 











| 
0-08634 


= Fecha | 
| | | 
0 | 0-10223 | 0-09990 0-09802 | 0-09564 | 0-09371 0-09177 | 
| -21897 *21523 *21219 -20830 -20514 | -20191 | -19862 -19612 *19272 
|} ] -16737 | -16381 | -16019 *15689 *15393 | -15093 | *14789 *14520 *14249 
| *30931 *30501 | -30058 | -29648 | +29274 | -28890 | +28496 *28143 | +27782 
| 2} +26326 | 25813 | -25318 | -24841 | -24384 | -23974 | -23559 | -23138 | -22768 
| +39964 | -39601 +39237 *38875 *388517 | -+38187 *37844 | -37487 *37166 
3 *33589 +32984 *32386 -31820 *31285 | -30763 | -30276 | -29781 *29324 
} -43666 -43463 *43243 -43018 -42789 *42551 *42317 -42066 | -41823 
4 | -39405 | -38743 | -38100 | -37477 | -36894 | -36317 | -35765 | -35241 | -34726 
*44572 -44569 44540 44488 | 44419 *44332 | +44231 -44119 | +43994 
| 5 | 0:44226 | 0-43535 | 0-42854 | 0-42214 | 0-41588 | 0-40977 | 0-40384 | 0-39827 | 0-39274 
*43745 -43961 44141 *44280 -44390 | 44472 | *44528 | 44560 | -44574 
| 6 *48254 *47555 -46870 -46214 | *45565 “44951 | *44348 | -43758 | -43186 
| -41743 *42188 *42582 +42923 *43225 *43480 -43703 | *43895 | *44062 
| 7 | +51660 | -50970 | -50286 | -49631 | -48987 | -48357 | -47755 | -47161 | -46587 


*38887 +39566 -40186 -40733 
54531 *53857 -53198 *52557 
+35402 -36326 +37164 | -37920 


9 -56928 -56301 -55672 
*31453 *32597 -33663 *34627 | + +35518 -36326 | -37077 | -37762 | *38385 


oo 


-51929 -51309 | -50708 | -50119 
-38608 | -39240 | -39810 


-49544 
-40330 | -40802 





o 
So 


0-56612 | 0-56049 | 0-55490 














| 
| 
-41228 -41674 | -42065 -42420 | +42733 


-55058 -54449 | .53857 | -53268 -52694 | -52138 


| 
10 | 0-58917 | 0-58337 | 0-57757 | 057182 | 0-54940 | 0-54392 | 
27128 | -28514 | -29701 | -30961 | -32042 | 33034 | -33055 | -34805 | -35509 
11 | -60522 | -60014 | -59490 -58966 | -58443 | -57919 | -57393 | -56873 | -56358 
| | 
| 29538 | -24133 | -25630 | -27003 | -28269 | -29444 | -30543 | -31556 | -32497 
| 12 | -61782 | -61350 | -60899 | -60437 | ‘59971 | -59490 | -59011 | -58530 | -58048 
17724 | - 21245 | -22815 | -24262 | -25630 | -26892 | -28064 | -29162 
| 17724 | -19549 | -21245 | 22815 | -24262 | -25630 | -26 28064 | 
| 13 | -62711 | -62370 | -62005 | -61620 | ‘61212 | -60795 | -60370 | -59932 | -59490 
12757 | -14802 | 16690 | -18436 | -20087 | -21614 | -23034 | -24377 | 25630 
14 | 63321 | -63092 | -62821 | -62521 | -62192 | -61843 | -61476 | -61094 | -60706 
07701 | 09925 | -12009 | -13937 | -15756 | ‘17451 | -19038 | -20533 |. -21923 
| | | 
15 | 0-63624 | 0-63520 | 0-63360 | 0-63157 | 0-62917 | 0-62645 | 0-62348 | 0-62030 | 0-61697 
02568 | -04973 | -07242 | -09343 | -11321 | -13177 | -14926 | -16568 | -18103 
| 16 — | 63662 | -63628 | -63536 | -63394 | -63211 | -62994 | -62751 | -62482 
| e. 02425 | -04687 | -06830  -08838 | -10726 | -12492 | -14169 
i aaa a — | -63662 | -63632 | -63550 | -63423 | -63259 | -63062 
| | 0 02281 | -04482 | -06450 | 08365 | -10175 
jis} — | — | — | = — | -63662 | -63635 | -63561 | -63446 
| | | | | 0 | 02154 | -04193 | -06132 
9} — | — _ —- | -|-}] = 63662  -63638 
| | 0 02042 | 
| | | | } 
20); — | _ | —{|/-—-|-— Sos F, } — | — 





40 


0-08483 
-19013 


14014 | 
-27466 | 
22394 | 
36834 


-28884 | 
-41579 | 


+34222 
bead 





0-38743 | 
-44569 | 
| 

-42638 | 
-44191 | 
46025 | 
-43014 | 
-48987 | 
41228 | 


-51583 | 
3896 | 


0°53857 | 
-36326 | 


-55843 | 
33381 | 
-57567 

30188 | 
59049 | 
26794 | 


60306 | 
23237 | 


0-61350 | 
*19549 | 
*62192 
-15756 
-62839 
-11884 
63298 | 
-07954 
-63571 
03986 


| 063662 
e 
Table 1 (cont.)

Upper figure in table is $w = Z^2/pq$; lower figure is $wX$
For $r < \tfrac{1}{2}n$, $wX$ is negative. For $r > \tfrac{1}{2}n$, use $n - r$ as argument, and $wX$ is positive

r \ n | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50
0-08330 | 0-08177 | 0-08022 | 0-07919 | 0-07762 | 0-07657 | 0-07498 | 0-07391 | 0-07283 | 0-07175 
-18750 | -18483 | -18212 | -18029 | -17751 | -17563 -17277 | -17084 | -16889 | -16692 
| | | 
-13776 | -13537 | -13335 | -13091 | -12886 | -12679 | -12512 | -12301 | -12132 | -11961 
27144 | -26815 | 26536 | -26194  -25904 | 25609-25369 | -25064 | -24816  -24564 
| j | | | 
22044 | -21689 | -21361 | -21059 | -20723 | -20445 | -20164 | -19881 | -19595 —-19338 
-36517 | -36190 | 35881 | -35592 | -35265 | -34990 | -34709 | -34420 | -34124 | -33855 
| | 
-28460 | -28031 | -27644 | -27252 | -26880  -26503 | -26148 | -25813 | -25475  -25160 
41335 | -41078 | -40838 | -40587 | -40342 | 40087 -39839 | -39601 | -39354 | -39118 
-33749 | -33268 | -32820 | -32386 | -31968 | -31564 | -31155 | -30763 | -30387 | -30029 
43716 | -43561  -43405 | -43243 | 43078 | -42010 | -42731 | -42551 | -42371 | -42193 
| | | | 
0-38237 | 0-37721 | 037249 | 0-36769 | 0-36317 | 0-35876 | 0-35448 | 0-35032 | 0-34611  0-34222 
“44548 | -44511 | -44464 | -44402 | 44332 | -44252 | -44165 | -44070 | -43964 | -43857 
-42096 | -41588 | -41070 | -40589 | -40099 | -39633 | -39192 | -38743 | -38305 | -37894 
44303 | -44390 | -44461 | -44511 | -44547 | -44567 | -44574 | -44569 | -44551 | -44525 
-45474 | -44951 | -44429 | -43924 | -43437.| -42955 | -42478 | -42022 | -41588 | -41147 
43264 | -43480 | -43675 | -43844 | -43989 | -44116 | -44226 | -44317 | -44390 | -44451 
-48437 | -47907 | -47376 | -46870 | -46376 | -45885 | -45409| -44951 | -44497 || -44048 
-41619 | -41969 | -42294 42582 | -42842 | -43080  -43292 | -43480  -43651 | -43804 
51041 | -50513 | -49993 | -49480 | -48987 | +48505  -48023 | -47555 | -47100 | -46649 
-39499 | -39986 | -40437 | -40853 | -41228 | -41572 -41895 | -42188 | .42454 | -42701 
} | | 
0-53330 | 0-52812 | 0-52307 | 051805  0-51309 | 0-50829 | 0-50359 | 0-49887 | 0-49436 | 0-48987 
“37001 | -27625 | -38200 | -38738 | -39240 | -39698 | -40123 | -40525 | -40887 | -41228 
‘55387 | -54837 | -54342 | -53857 | -53374 52903 | -52437 | -51996 | -51534 | -51091 
34196 | -34958 | -35668 — -36326 | -36946 | -37519 | -38055 | -38554 | -39016 “39452 
57089 | -56612 | -56143 | -55672 | -55214 | -54757| -54301  -53857| -53417 | -52984 
‘31143 | -32042 | -32873 | -33663  -34389 | -35076 | -35725 | -36326 | -36892 | -37422 
| | j | 
-58606 | -58157| -57713 -57275| -56819| -56394 | -55961 | -55528 | -55105| -54684 
| | | 
-27886 | 28920 | -29884 | -30779 | -31626 | -32434 «33183-33894 | -34556 | “35181 
59903 | -59490 | -59082 | -58669 | -58254  -57838 | -57431 | -57022 | -56612 | -56208 
24463 | -25630 | -26711 | -27735 | -28704 -29617 | -30465 | -31272 | -32042  -32760 
0-60994 | 0-60623 | 0-60250 | 0-59873 | 0-59490 | 0-59109 | 0-58720 | 0-58337 | 0-57950 | 0-57567 
20905 | -22202 | -23411 | -24549 | -25630 | -26641 | -27611 -28514 | 29377 -30188 
61885 | -61570 | -61239 | -60899 | -60558 | -60204 | 59849 | -59490 | -59131 | -58771 | 
17254-18647 | -19982 | -21245 | -22421 | -23556  -24620 | -25630 | -26585 | -27487 
62593 | -62331 | -62049 | -61759 | -61454 | -61142 | -60820 | -60496 | -60162 | -59829 
‘13503 | 15018 | -16476 | 17831 | -19128 | -20355 | -21525 | -22626 | -23686 | -24677 
63119 | -62917 | -62692 | -62450 | -62192 | 61921-61641 -61350 | -61050  -60748 | 
09689 | -11321 | -12882  -14355 | -15756 | -17086 | -18345  -19549 | -20697 | -21775 
63467 | -63332 | .63169 | -62982 | -62776 | -62552 | -62317 | -62064 | -61806 | -61534 
05831 | -07574 | -09232 | -10820 -12321 -13751 | -15095 | -16399 = -17618 | -18798 | 
| | | | 
063640 | 0-63580 | 0-63485 | 0-63360 | 0-63211 | 0-63041 | 0-62850  0-62645 | 0-62425 | 0-62192 | 
01946 | -03795 | -05561 | -07242 | -08838  -10349 | -11806 -13177 | -14494 | -15756 
— | .63662 | -63642 | -63587 | -63501 | -63386 | -63249 | -63092 | -62917 | -62725 
| 0 | -01851 | -03620 | -05307  -06925 | -08460 | -09925 | -11321 | -12664 
ae ee pe: Lie 63662 | -63644 | -63593 | -63514 | -63409 | -63283 63137 
| 0 | 01771 | -03461 | -05084 | -06640 | -08112  -09532 
tee (eee ie sas - -63662 | -63646 | -63599 | -63526 | -63429 
| | © | -01691 | -03317 | -04877 | -06370 
| | 
a | a p. —~ | — | — | ~ | .¢see2| -63647| -e3604 
| | 0 -01627 | -03190 
= il ~ i oe be ee | = ll — | 0-63662 
| | | st: 
| | | sae 
Table 2. Normit weights


Upper figure in table is $w = Z^2/pq$; lower figure is $wX$
For p < 0.50 on left, wX is negative. For p > 0.50 on right, wX is positive





| Thousandths, for p on left 








| 0 | 1 | 2 | a ae 5 | 6 | 7 8 9 
| 
~~ | 


| | a 


0-00 —_— 0-01135 | 0-02014 | 0-02799 





0-03523 | 0-04203 | 0-04847 | 0-05463 | 0-06054 | 0-06624 | 0-07175 | 0-99 | 25 | ° 
— | -03507| -05796| -07690| -09343| -10825| -12177| -13424| -14584| -15670 “16602 | 
01 | 0-07175 | 07709} -08228| -08733 -09226| -09707| -10177| -10636| -11087 | -11528/ -11961, 98 | 7° | 

16692 “17657 | “18572 | 19443-20272) -21064 | -21823) -22550) -23248| -23019| -24564 | 
02 | -11961| -12386/ -12803| -13213/ -13617| -14014| -14404| -14789| -15168| 15541 -15010 | rn ea 
24564 | -25187| -25787| -26366| -26925| -27466| -27989| -28496| -28087| -29462| -20923 | | 
03 | -15910| -16273| -16631| -16984| -17333 17678 | -18018| -18354| -18686| -19014| -19338| 95 | 28 

-29923 | -30370| -30804| -31225| -31633| -32031 | -32416| -32791| -33156| -33511| -33855 | 
04 | 19338) -19659| -19976| -20289| -20600| -20906| -21210| -21510| -21808| -22102| -22394| .95 | 9 
33855 | -34191| -34517| -34835| 35144) -35445| -35738| -36023| -36300| -36571| -36834 











| | 30 | 0 
0-05 | 0-22394 | 0-22682 | 0-22968 | 0-23251  0-23531 | 0-23809 | 0-24084 | 0-24357 | 0-24627 | 0-24895 | 0-25160 | og | 930 | ° 
36834 | -37091 | -37340| -37584 | -37821| -38051| -38276 | -38495 | -38708| -38916| -39118 


| 
06 | -25160| -25423 | +25684 | -25942 | *26199 | -26453 | -26705| -26955| -27203| -27449| -27693 | 93 ‘a 
*39118 | -39315| -39507| -39694| -39875 | -40052| -40225| -40392| -40555| -40714/| -40868 | 


| | 29 
07 | -27693| -27935| -28175| -28413 -28649| -28884| -29116| -29347| -29576! -29804| -30029) -92 32 
40868 | -41019 -41165| -41307| -41445| -41579| -41709| -41836) -41958| -42078 | -42193 | | 
08 | -30029| -30253| -30476! -30697, -30916| -31134| -31350| -31564| -31777| -31989| -32199, 91 | 8 
42193 | -42306| -42415 | -42520| -42622| -42722  -42817| -42910| -43000| -43087| -43171 | 


09 | -32199| -32407, -32614| -32820| -33024| -33227/ -33429| -33629| -33828| -34026| -34222) 90 | 4 | 
43171 | 43251 | -43330 43405-43477 | -43547) -43614 | -43670| -43741 | 43800-43807 | 


| 
| | 0-34803 | 0-34994 | 0-35184 | 0-35373 | 035560 | 0-35747 | 0-35932 | 0-36116 | 0-89 | "35 0 
-43964 | -44013 | -44061| -44106 | pen -44189 | -44227| -44263| -44297. | 

















0-10 | 0-34222 | 0-34417 
-43857 | -43912 





| 034611 
| -36116| -36299| -36480 | -36661 | -36840 | -37019| -37196| -37372| -37547| -37721| -37804| -88 | 6) 
44297 | -44329| -44359 | -44386 | -44412| -44436| -44457| -44477| -44495| -44511| -44525 
-12 | -37894| -38066| -38237 -38407 -38576 | -38743/ -38910| -39076| -39241| -39405| -39568, 37 | °7 | 
44525 | 44037 44548 44556-44563 | -44569) 44572 -44574| -44574) -44572 | -44569 





























13 | -39568| -39730| -39891| -40051| -40210| -40369| -40526| -40682| -40838| -40993| -41147 86 “38 | 
44569 | -44564| -44558| -44550| -44540 44529 | -44517| -44502| -44487| -44470| -44451 
14 | -41147| -41299| -41452| -41603| -41753| -41903| -42051| -42199| -42346 "42492 | 42638 85 "39 
44451 | -44431 | -44410 | | “44303 | 44338 | 44311 | -44283| -44254 |) -44223 -44191 | 
“15 | 0-42638 | 0-42782 0-42926 | 0-43069 0-43211 | 0-43358 | 0-43493 | 0-43633 | 0-43772 0-43910 | 0-44048 | 0-84 nia 
| 44191 | -44158| -44123) -44088| -44051| -44012| -43973 43933 | -43891 | -43848 | -43804 | | 
16 | -44048| -44185| -44321| -44456 | -44591| -44725| -44858| -44990| -45122| -45253| -45383| -83 “al | 
43804 | -43759| 43713-43665 -43617 -43567 | -43516 | pesad -43412| -43358| -43303 | 
‘17 | -45383| -45513| -45642| -45770| -45898| -46025 | 146151 | -46276| -46401| -46525| -46649| -82 “42 | 
| 43303 | -43247| -43191  -43133 | "43074 | 43014 | “42953 | -42892 | -42829| -42765| -42701_ 
| +18 | -46649| -46772| -46894, -47016 | -47136| -47257| -47376| -47495| -47614| -47732| -47849, 81 “43 | 
| 42701 | -42635| -42569; -42502| -42433 | 42364 | 42294-42224) -42152) -42080| -42006 | 
‘19 | -47849| -47965| -48081 -48196 | -48311 | -48425  -48539 | -48652| -48764| -48876| -48987| 80 “44 | 
| 42006 | -41932) -41857 -41781 41705-41627 -41549| -41470| -41390| -41310| -41228 
-20 | 0-48987 | 0-49097 | 0-49207 | 0-49317 0-49425 0-49534 | 0-49641 | 0-49748 | 0-49855 | 0-49961 | 0-50066 | 0-79 | "#5 | % 
| 


41228 | “41146 | -41063 | -40980 | -40895 | -40810  -40725 | -40638| -40551| -40463) -40375 | 


| | | P 46 | 
‘21 | -50066 | -50171! -50275| -50379 | -50482, -50585| -50687 | -50789 | -50890| -50991| -51091| ‘78 - } 
-40375 | 40285 | 40195 | -40105, -40013| -39921 | -39829 -39735| -39642| +39547)| -39452 








22 | -51091| -51190| -51289| -51387| -51485| -51583| -51680| -51776| -51872| -51967| -52062| ‘77 i 
| 39452 | 39356 | pesend 39162-39065 -38966| -38868 | -38768| -38668| -38567| -38466 

‘23 | -52062 | -52157| -52250| -52344| -52437  -52529| -52621  -52712| -52803| -52894| -52984| “76 eal 
| 38466 | -38364 | -38262| -38159 | 38055 -37951 | -37846 37741-37636 | -37529| -37422 | 

24 | +52984| -53073| -53162| -53251| -53339| -53426| -53513| -53600| -53686| -53772| -53857| “75 ” 
-37207 | -37099| -36990 | sua -36771 | 36660 | -36549 | -36438 | -36326 
- 
| 0). oe 


| 








| 
| 
| 








Thousandths, for p on right


Table 2 (cont.)

Upper figure in table is $w = Z^2/pq$; lower figure is $wX$
For p < 0.50 on left, wX is negative. For p > 0.50 on right, wX is positive

Thousandths, for p on left

| 0 1 2 | 3 4 5 6 } 7 8 9 | 
— +] eT 
0-25 | | 0-53857 | 0-53942 | 0-54026 | 0-54110 | 0-54193 | 0-54276 | 0-54359 | 0-54441 | 054523 | 0-54604 | 0-54684 | 0-74 
| *36326 -36214 36101 | ‘35987 | -35874 | -35759| -35645| -35529) -35414/ -35298| -35181 
26 | -54684 | -54765| -54845 |) -54924| -55003| -55081 | -55159 | -55237| -55314 | 55391 | -55468| -73 
| e) -35064 | -34946| -34829| -34710| -34591| -34472 panned 34233 | 34112 | -33991 
27 | +5468 | -55543| -55619| -55694| -55769| -55843| -55917| -55990| -56063) -56136| -56208| -72 
| 33991-33870 | -33748 | -33626 | -33504  -33381 | -33257| -33134 | -33010| -32885 | -32760 
28 | -56208 | -56280| -56351 | -56422 56493 | -56563  -56632 56702 | -56771 | -56839| -56907| -71 | 
“32760 | 32635 | -32510| 32384) -32257| -32131  -32003 -31876 | -31748 | -31620| -31492 
29 | -56907 | ‘56975 | -57042| -57109| -57176| -57242) -57308| -57373| -57438 | ‘57503 | -57567| +70 
| 81492) -31363) 31234) -31104 -30974 me | -30713 paiva, *30451 | -30320| -30188 
030 | 0-57567 | 0-57631 | 0-57694 | 0:57757 | 0-57820 | 0-57882 | 0-57944 | 0-58005 | 0-58066 | 0-58127 | 0-58188 | 0-69 
| ‘30188 | -30056 | -29923| -29791 29657 | -29524| -29390 pence -29122 28987 | -28852 
31 | -58188| -58248| -58307| -58366| -58425| -58484 | ‘58542 | -58600 | -58657 | 58714 | 58771 | -68 | 
28852 | ‘28717 | -28582|) -28446| -28310 | 28173 | -28037 | -27900 | -27762) -27625, -27487 | 
‘32 58771 | -58827| -58883| -58939| -58994 -59049 | -59103 |) -59158 59211 | 59265 | -59318| -67 
‘27487 | -27349 -27211| -27072) -26933| -26794  -26655| -26515 -26375 | 26235 | +26095 | 
33 -59318) -59871| -59423| -59475| -59527 | 59578 | -59629 | -59679 | -59730| -59780| -59829| -66 
-26095 | +25954| -25813| -25672| -25531| -25389| -25247| -25105| -24963| -24820| -24677 | 
34 | 59829 | 59878 | -59927 | -59976| -60024| -60072| -60119| -60166 | 60213 | “60260 | 60306 | -65 
| +2467 | sani -24391 24248 | -24104| -23960 | 23816 | -23671 | -23527| -23382) -23237 | 
| | 
0-35 | 0 60306 | 0-60351 | 0-60397 | 0-60442 | 0-60487 | 0-60531 | 0-60575 | 0-60619 | 0-60662 | 0-60706 | 0-60748 | 0-64 
| -23237 | -23092| -22946| -22801| -22655| -22509 | -22363/| -22216| -22070| -21923| -21776 
26) « é F « “ 5 -60916 -60957 | - 998| - 8| - 78| - 8 4 5 
36 | -60748| -60791 -60833 -6087 60 60 60 6103 61078 | -61118| -61158| -63 
21776 | -21629 | -21481 21334 | -21186 ‘21038 | -20890 | -20741| -20593/| -20444| -20295 | 
| | | | 
‘37 | 61158 | 61197 | -61236| -61274| -61312| -61350| -61387 61425 | -61462) -61498 | -61534| -62 
| -20295| -20146 19997 | 19848 | -19698| -19549 | ‘19399 | -19249 | 19098 | -18948 | 18798 | 
| | | | | 
38 | -61534| -61570 -61606 | -61641| -61676 61711 | -61745 | -61779| -61812| -61846| -61879| -61 
“18798 | prone -18496 | -18345| -18194 | "18043 | 17891) -17740 | -17588| -17436 | -17284 
| | 
39 «61879 | -61912 | -61944 | -61976| -62008| -62039, -62070| -62101 | -62132| -62162/ -62192 -60 
‘17284 | -17132|) -16979| -16827| -16674 “16522 | “16369 | -16216 | -16063| -15910| -15756 
| 
0-40 | 0-62192 | 0-62221 0-62251 | 0-62280 | 0-62308 | 0-62337 | 0-62365 | 0-62392 | 0-62420 | 0-62447 | 0-62474 | 0-59 
| ‘15756 | 15603 | -15449| -15295| -15141 -14987| -14833| -14679| -14525| -14370) -14216 
‘41 | -62474 | -62500/ -62526 -62552 62578 | -62603 | -62628| -62653| -62677| -62701| -62725| -58 
| 14216 | -14061 | -13906 | -13751 | -13596  -13441| -13286| -13130| -12975) -12819| -12664 
42 | -62725| -62748| -62772| -62794| -62817| -62839| -62861| -62883/| -62904| -62925| -62946  -57 
| 12664) -12508 12352 | Sanaa -12040 | -11884| -11728 “11572 | “11415 | -11259 | -11102 | 
43 | -62946 | -62966| -62986 | -63006  -63026 63045 | -63064 | -63082| -63101| -63119| -63137| -56 | 
| ‘11102 | -10945/ -10789| -10632! -10475| -10318| -10161| -10004 09846 | -09689 | -09532 
‘44.| -63137) -63154| -63171 eause | 63205 | -63221 | -63237 | -63252 63268 | 63283 | -63298| -55 
| 09532 | -09374) -09217 pat) er 08744 | -08586/| -08428 08270 | ‘08112 | -07954 
0-45 | 063298 | 0-63312 | 0-63326 | 0-63340 | 0-63354 | 0-63367 | 0-63380 | 0-63393 | 0-63405 7 | 0-63429 | 0-54 
07954 | -07796| -07638| -07480 | 07321 | ‘07163 | -07005 | -06846 | -06688 | -06529| -06370 
46 | -63429| -63441| -63452 -63463 | -63473| -63484| -63494| -63503 | -63513 | -63522| -63531| -53 
| 06370 | -06212| -06053| -05894 | -05736| -05577| -05418| -05259| -05100  -04941 | -04782 
‘47 | -63531| -63540| -63548| -63556| -63564| -63571| -63578| -63585| -63592| -63598  -63604| -52 
| 04782 | -04623| -04464| -04305| -04146| -03986| -03827| -03668| -03509| -03349| -03190 
“48 | 63604 | -63609| -63615|) -63620| -63625| -63629 -63633| -63637| -63641| -63644| -63647| -51 
| 03190 | -03031 | -02871| -02712| -02552| -02393 -02234| -02074| -01915| -01755| -01596 
49 | -63647 | -63650) -63653| -63655| -63657| -63658 | -63660| -63661| -63661| -63662| -63662| -50 
| 01596 | -01436| -01277| -01117| -00957| -00798| -00638/ -00479/| -00319| -00160| -00000 
9 8 7 6 5 4 3 2 1 0 
Thousandths, for p on right
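
Individual entries of Tables 1 and 2 can be spot-checked numerically from the definitions in the table captions. The following is only a minimal sketch, not the procedure by which the tables were prepared. It assumes Python's standard-library `statistics.NormalDist` for the normal deviate and ordinate, and the function name `normit_weight` is an illustrative choice that does not appear in the paper.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution (mean 0, s.d. 1)


def normit_weight(p):
    """Return (w, wX) for a proportion p with 0 < p < 1.

    X is the normit of p (the standard normal deviate whose lower tail
    area is p), Z is the normal ordinate at X, and w = Z^2 / (p q).
    """
    q = 1.0 - p
    x = STD.inv_cdf(p)   # normit X
    z = STD.pdf(x)       # ordinate Z at X
    w = z * z / (p * q)
    return w, w * x


if __name__ == "__main__":
    for p in (0.01, 0.25, 0.50):
        w, wx = normit_weight(p)
        print(f"p = {p:.2f}  w = {w:.5f}  wX = {wx:.5f}")
```

For p = 0.25 this should agree with the Table 2 entries 0.53857 and 0.36326 (the latter taken as negative, since p < 0.50) to within rounding in the fifth decimal place.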
Part II. COMPARISON BETWEEN THE MINIMUM NORMIT $\chi^2$ ESTIMATE 
AND THE MAXIMUM LIKELIHOOD ESTIMATE 


1. Description of the investigation 


The statement in Part I (p. 411), characterizing the minimum normit $\chi^2$ estimate as having 
smaller variance and mean square error than the maximum likelihood estimate, was made in 
a preliminary version of the present paper, on the basis of results obtained in a planned 
programme of sampling experiments (Berkson, 1955a). The editor of Biometrika, in his 
comments on the manuscript, conveyed the suggestion that perhaps these experiments were 
not a sufficient support for so firm a statement. Originally I had thought that the statement 
was supported well enough. However, on reflexion, I realized that my own conviction was 
based, in addition to the specific sampling results cited, on cumulated auxiliary results 
gained in the course of investigations of similar questions going over many years. I decided 
that the editor’s comment was sound, for it was not to be expected that an average reader, if 
he found the conclusions surprising, would sympathetically seek out all this evidence. And 
so I was prompted to initiate a new programme of calculations, the results of which are 
presented here. These results and those obtained in previous studies are summarized in the 
concluding section. 

It was decided in the first place to calculate the bias and variance of the estimates by 
actual summation over the total sampling population for all the primary situations in which, 
previously, these statistics had been estimated with the use of stratified random samples. 
With the precise values of these statistics thus made available, it was possible to explore 
some other questions, for which even good approximations were inadequate, notably some 
questions concerned with the lower bounds of the variance and the mean square error. For 
similar reasons the distributions of the sample $\chi^2$'s were investigated. In addition, the bias 
and variance for an auxiliary experiment, corresponding to the estimate of relative potency, 
were calculated using a stratified random sample. 

The situation dealt with simulates a bio-assay experiment with three equally spaced 
‘dosages’ $x_i$, at each of which $n_i = 10$ animals are ‘exposed’. The primary experiment was 
with dosages scaled $-1$, $0$, $+1$, at which the true probability of response $P_i$ was respectively 
0.3, 0.5 and 0.7, which determined the parameters as $\alpha = 0$, $\beta = \text{normit } 0.7$. For the same 
parameters and with the same equal spacing of the dosages, experiments were performed 
also with the probability of response at the central $x_i$, $P_2 = 0.6$, 0.7 and 0.8. Since $p_i$, the 
proportion of animals responding at each $x_i$, may have any of 11 possible values, the total 
number of possible sets of responses is $11^3 = 1331$. It was, therefore, necessary to obtain the 
estimates for each of 1331 possible samples. The programme of experiments was in two 
sections, one in which $\beta$ was considered known, with only $\alpha$ to be estimated, the other 
in which both parameters were to be estimated simultaneously. In what follows, the 
procedures which were used are described in terms of the estimation of both parameters, 
from which the corresponding procedures used for the estimation of $\alpha$ alone should be 
clear. 
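
The sampling population just described is small enough to enumerate completely. The sketch below is an illustration only, not the punched-card procedure used in the investigation. It lists the 1331 samples for the primary arrangement, attaches to each its probability as a product of three binomial terms, and obtains the mean and variance of an estimate by direct summation. The function names are hypothetical, and the estimator in the example (the observed proportion at the central dosage) is a placeholder chosen because its exact mean and variance are known, not one of the estimates studied in the paper.

```python
from itertools import product
from math import comb
from statistics import NormalDist

STD = NormalDist()
DOSES = (-1.0, 0.0, 1.0)   # the three equally spaced dosages x_i
N = 10                     # animals exposed at each dosage


def response_probs(alpha, beta):
    """True response probabilities P_i = Phi(alpha + beta * x_i)."""
    return [STD.cdf(alpha + beta * x) for x in DOSES]


def sample_probability(r, probs):
    """Probability of the sample r = (r_1, r_2, r_3) of responder counts."""
    prob = 1.0
    for r_i, big_p in zip(r, probs):
        prob *= comb(N, r_i) * big_p**r_i * (1.0 - big_p)**(N - r_i)
    return prob


def exact_moments(estimator, probs):
    """Mean and variance of estimator(r), summed over every possible sample."""
    mean = second = 0.0
    for r in product(range(N + 1), repeat=len(DOSES)):
        pr = sample_probability(r, probs)
        value = estimator(r)
        mean += pr * value
        second += pr * value * value
    return mean, second - mean * mean


# Primary arrangement: alpha = 0, beta = normit 0.7, so P = 0.3, 0.5, 0.7.
probs = response_probs(0.0, STD.inv_cdf(0.7))
n_samples = len(list(product(range(N + 1), repeat=len(DOSES))))

# Placeholder estimator (an assumption for illustration): the observed
# proportion at the central dosage.
mean, var = exact_moments(lambda r: r[1] / N, probs)
```

Any function of the three responder counts can be passed as `estimator`, so the same summation serves for the bias, variance and mean square error of either estimate once it has been computed for each sample.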

The minimum normit $\chi^2$ estimates of $\beta$ and $\alpha$ have been given in equations (5) and (6) of 
Part I above, together with definitions of $p_i$, $q_i$, $X_i$, $w_i$. $n_i$, the number exposed to risk at $x_i$, 
is 10 for all values of $i$. 

The values of $X_i$ and $Z_i$ required for evaluation of (5) and (6) were obtained as for Table 1, 
and the estimates were written to four decimal places correct to $\pm 1$ in the last place. Each 
estimate was checked by inserting the values in the estimating equations 

$$\sum_i n_i w_i (X_i - \hat{X}_i) = 0, \tag{12}$$
$$\sum_i n_i w_i x_i (X_i - \hat{X}_i) = 0, \tag{13}$$

where $\hat{X}_i = \hat{\alpha} + \hat{\beta} x_i$. Check of the estimate was performed by a different computer than the 
one who calculated its value. Because of symmetry among the samples, not all 1331 of them 
required independent computation. For estimate of both $\alpha$ and $\beta$, only about a quarter, and 
for estimate of $\alpha$ alone, about half of the samples required independent calculation, the 
estimates for the uncomputed samples being obtainable from those computed, by appropriate 
change of sign. 

Since for $p_i = 0$ or $p_i = 1$, the corresponding value of $X_i$ is infinite, some rule must be 
adopted to modify these observations, if a finite estimate is to be obtained with samples 
containing either of them. I have used the rule of substituting for an observation of $p_i = 0$, 
a working value $p_i = 1/(2n_i)$ and for $p_i = 1$, a working value $p_i = 1 - 1/(2n_i)$.* 
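
The estimate and the working-value rule can be put together in a few lines. What follows is a minimal sketch under stated assumptions: equations (5) and (6) of Part I are not reproduced here, so the code instead solves the estimating equations (12) and (13) directly, as a weighted least-squares line of the normits $X_i$ on the dosages $x_i$ with weights $n_i w_i$. The function names are hypothetical, and the normal deviate and ordinate come from Python's `statistics.NormalDist` rather than from tables.

```python
from statistics import NormalDist

STD = NormalDist()


def working_proportion(r, n):
    """Observed proportion r/n, with the 2n rule applied at 0 and 1."""
    if r == 0:
        return 1.0 / (2 * n)
    if r == n:
        return 1.0 - 1.0 / (2 * n)
    return r / n


def min_normit_chisq(doses, r, n):
    """Return (alpha, beta) satisfying the weighted normal equations."""
    sw = swx = swxx = sw_big_x = swx_big_x = 0.0
    for x, r_i, n_i in zip(doses, r, n):
        p = working_proportion(r_i, n_i)
        big_x = STD.inv_cdf(p)                    # normit X_i
        z = STD.pdf(big_x)                        # ordinate Z_i
        w = n_i * z * z / (p * (1.0 - p))         # total weight n_i * w_i
        sw += w
        swx += w * x
        swxx += w * x * x
        sw_big_x += w * big_x
        swx_big_x += w * x * big_x
    x_bar = swx / sw
    big_x_bar = sw_big_x / sw
    # Weighted least-squares slope and intercept of X on x.
    beta = (swx_big_x - x_bar * sw_big_x) / (swxx - x_bar * swx)
    alpha = big_x_bar - beta * x_bar
    return alpha, beta


if __name__ == "__main__":
    print(min_normit_chisq((-1.0, 0.0, 1.0), (3, 5, 7), (10, 10, 10)))
```

For the sample with 3, 5 and 7 responders the three normits lie exactly on a line, so the sketch returns $\alpha = 0$ and $\beta = \text{normit } 0.7$, the true parameters of the primary experiment.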

To obtain the maximum likelihood estimate an iterative procedure must be used. For the 
present experiment two methods were employed. For the most part the procedure utilizing 
the linear transform as outlined by Fisher & Yates (1953) was used, and with this the tables 
of Finney & Stevens (1948) were utilized; secondarily the method set forth by Garwood 
(1941) was employed, for which convenient tables have been provided by Cornfield & 
Mantel (1950). The two methods are fairly comparable in respect to the amount of arithmetical 
labour involved, and no regular difference was noted in regard to speed of convergence, 
but the linear transform method is more directly analogous to the common method of 
fitting a straight line by least squares, and therefore more familiar to the computers; this 
was the chief reason for preferring it. The estimates as used for calculation of the statistics 
were to be correct to better than $\pm 1$ in the third decimal place and therefore were set down 
with four decimal places, the last being correct to $\pm 5$. In order to attain estimates of the 
desired precision, the following procedure was used: with $\hat{\alpha}_0$ and $\hat{\beta}_0$ (provisional values of 
the estimates) decided on, $\delta\alpha$ and $\delta\beta$ (the respective corrections to these for improved 
estimates) were computed using the Finney-Stevens tables. If either $\delta\alpha$ or $\delta\beta$ was as large as 
0.01, another iteration was performed in the same way, and this process was continued until 
both $\delta\alpha$ and $\delta\beta$ were less than 0.01. The argument of the Finney-Stevens tables is carried to 
two decimal places only and cannot conveniently be used beyond this stage of the iterations. 
From this point on, the computations were accomplished using the W.P.A. tables (Lowan, 


* This is an old ‘empirical’ rule, the origin of which I do not know. Another rule has been advanced 
by Reed (1936), still another by Gaddum (1933), and doubtless there are others besides these which have 
been employed. Elsewhere I (Berkson, 1955b) have set forth some considerations on the basis of which the 
2n rule may be considered reasonable, but I do not advance them as necessarily preferable to any other. 
It should be noted that the contrast between the minimum normit $\chi^2$ estimate and the maximum 
likelihood estimate does not essentially consist in the different methods used for dealing with the 
observations of zero or 100 % response. In maximum likelihood estimation the rule used is the application, 
to the particular observations of zero or 100 % response, of a modification required to be applied 
to all observations (Fisher, 1954), whereas in minimum normit $\chi^2$ estimation, except for the observations 
of zero and 100 %, there is no modification of the observations at all: the working value is simply 
the observation itself. And of course there is the difference in the weights used. In maximum likelihood 
estimation the weight for unit observation is $\hat{Z}^2/\hat{P}\hat{Q}$, where the capped quantities refer to the 
corresponding maximum likelihood estimates, whereas in the minimum normit $\chi^2$ estimation the unit 
weight is $Z^2/pq$, where the quantities represented correspond to the observations directly. 
1942), which are more detailed though less convenient than the Finney-Stevens tables. 
Iteration was continued until both $\delta\alpha$ and $\delta\beta$ were less than 0.0005. The estimates obtained 
with these adjustments, written to four decimal places,* were checked by insertion in the 
estimating equations 

$$\sum_i (n_i \hat{Z}_i / \hat{P}_i \hat{Q}_i)(p_i - \hat{P}_i) = 0, \tag{14}$$
$$\sum_i (n_i x_i \hat{Z}_i / \hat{P}_i \hat{Q}_i)(p_i - \hat{P}_i) = 0, \tag{15}$$

where $\hat{P}_i = 1 - \hat{Q}_i$ is the estimate of $P_i$, the probability at $x_i$, given by the estimated 
parameters, $\hat{Z}_i$ is the corresponding normal ordinate and $p_i$ is the sample value of the relative 
frequency. As in the case with the minimum normit $\chi^2$ estimate, independent computation 
was necessary for only about half of the samples for the estimate of $\alpha$ alone, and for about 
a quarter of the samples for the estimate of both parameters. 
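
The iteration and its final check can be sketched in modern terms as follows. This is an illustration under stated assumptions, not the tabular procedure described above: the corrections are obtained by Fisher scoring in closed form instead of from the Finney-Stevens and W.P.A. tables, the two-stage stopping rule (0.01, then 0.0005) is collapsed into a single tolerance, and the names `scores` and `ml_estimate` are hypothetical. The quantities returned by `scores` are the left-hand sides of equations (14) and (15).

```python
from statistics import NormalDist

STD = NormalDist()


def _sums(doses, r, n, alpha, beta):
    """Scores (left sides of (14), (15)) and information sums at (alpha, beta)."""
    s_a = s_b = i_aa = i_ab = i_bb = 0.0
    for x, r_i, n_i in zip(doses, r, n):
        eta = alpha + beta * x
        big_p = STD.cdf(eta)            # fitted probability P_i
        big_q = 1.0 - big_p
        z = STD.pdf(eta)                # normal ordinate Z_i
        u = n_i * z / (big_p * big_q) * (r_i / n_i - big_p)
        w = n_i * z * z / (big_p * big_q)
        s_a += u
        s_b += u * x
        i_aa += w
        i_ab += w * x
        i_bb += w * x * x
    return s_a, s_b, i_aa, i_ab, i_bb


def scores(doses, r, n, alpha, beta):
    """Left-hand sides of the estimating equations (14) and (15)."""
    return _sums(doses, r, n, alpha, beta)[:2]


def ml_estimate(doses, r, n, alpha0, beta0, tol=0.0005, max_cycles=50):
    """Iterate from provisional values until both corrections are below tol.

    Returns (alpha, beta, cycles). Samples that drive a fitted probability
    to 0 or 1 are not handled by this sketch.
    """
    alpha, beta = alpha0, beta0
    for cycle in range(1, max_cycles + 1):
        s_a, s_b, i_aa, i_ab, i_bb = _sums(doses, r, n, alpha, beta)
        det = i_aa * i_bb - i_ab * i_ab
        # Solve the 2 x 2 system (information) * (corrections) = (scores).
        d_alpha = (s_a * i_bb - s_b * i_ab) / det
        d_beta = (s_b * i_aa - s_a * i_ab) / det
        alpha += d_alpha
        beta += d_beta
        if abs(d_alpha) < tol and abs(d_beta) < tol:
            return alpha, beta, cycle
    raise RuntimeError("corrections did not fall below the tolerance")


if __name__ == "__main__":
    doses, r, n = (-1.0, 0.0, 1.0), (3, 5, 7), (10, 10, 10)
    a, b, cycles = ml_estimate(doses, r, n, alpha0=0.1, beta0=0.4)
    print(a, b, cycles, scores(doses, r, n, a, b))
```

Stopping on the size of the last corrections, and then confirming that the scores are near zero, gives an objective test of whether another cycle is needed, independent of how many cycles have already been run.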

The number of iterative cycles required depends of course on how good an approximation 
the provisional values $\hat{\alpha}_0$, $\hat{\beta}_0$ are of the maximum likelihood estimate. For the present 
experiment, the samples were ordered systematically to facilitate computation, each sample 
differing from the previous one in only a single value of the observed frequency. This enabled 
the computers to guess the values of the provisional estimates from the precise solution of 
the previous sample. Even with this arrangement the number of iterative cycles required 
varied considerably. In some regions the procedure required two cycles with the Finney- 
Stevens tables, followed by two with the W.P.A. tables, in others two with the Finney- 
Stevens tables, and only one with the W.P.A. tables. At most we used five cycles, while some 
samples required only two cycles. However, so much depends on the sample, and on how 
well the provisional values are estimated, that really, no good judgement can be made as to 
how many iterative cycles are required to obtain a satisfactory maximum likelihood 
estimate. Indeed, the point to stress is that one cannot depend on just how many cycles have 
been calculated. What one requires is some objective methodical scheme by which one can 
recognize whether another cycle is necessary. 

Each estimate was punched into a Hollerith punch card, together with its sample for 
identification. By contract with a service bureau of the International Business Machines 
Corporation, the square of the estimate of each parameter and their product were calculated 
with the use of IBM calculating machine number 164 and checked with the same machine, 
and the several results were punched into the same card. Other multiplications required for 
the calculations were performed in the same way. 

The 1331 cards thus prepared, each punched with its sample, the estimate α̂, the estimate
β̂, the square of each, and their product, constituted the basic instrument for the rest of the
computations. Corresponding to each dosage arrangement for which statistics were to be
computed, it was necessary now to calculate the probability of each sample. For instance, in
the dosage arrangement with P₂ = 0.5, the probability P₁ at the lowest dosage is 0.3 and the
probability P₃ at the highest dosage is 0.7. For each of the eleven possible samples at each
dosage the probability was directly calculated, checked, and hand-punched into auxiliary 


* In the corresponding experiment with the logistic function (Berkson, 1955b), the estimates were
computed, correct to one more place than in the present experiment. However, this was really more 
than was necessary for the purposes in hand, and the normit function is considerably more difficult to 
compute than the logit function. Actually, essentially the same results would have been obtained, so 
far as concerns the calculated values of the mean square error, if even one less place were retained, but 
it was considered desirable that the estimates should be calculated as precisely as practicable with the 
available computing facilities. 
cards. Each combination of three of these, one corresponding to each of the three dosages,
was multiplied together in two steps with the use of the automatic multiplying punch 
machine to give the probability of each sample, and each multiplication was checked. The 
resulting 1331 probabilities were transferred to the corresponding main cards with the use 
of a reproducing punch machine preparatory to multiplying them by the estimates, their 
squares and products. These multiplications were accomplished with the use of the auto- 
matic multiplying punch, and the results were punched into the same main card and checked. 
The sums of estimates, sums of squares and sums of products necessary for calculation of the 
required statistics were now obtained by totalling these quantities with the use of an available 
IBM punch-card tabulator. 
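
The bookkeeping just described can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a record of the punch-card procedure: ten animals at each of three dosages are assumed, giving 11 × 11 × 11 = 1331 samples, and the estimator passed to `moments` is a placeholder (the pooled response rate) chosen only to exercise the probability-weighted sums.

```python
import math
from itertools import product

def binom_pmf(r, n, p):
    """Probability of r responses among n animals when the true probability is p."""
    return math.comb(n, r) * p**r * (1.0 - p)**(n - r)

def sample_probabilities(true_p, n=10):
    """Probability of every sample (r1, r2, r3): one binomial factor per dosage."""
    per_dose = [[binom_pmf(r, n, p) for r in range(n + 1)] for p in true_p]
    return {rs: math.prod(per_dose[k][r] for k, r in enumerate(rs))
            for rs in product(range(n + 1), repeat=len(true_p))}

def moments(estimator, probs):
    """Probability-weighted mean and variance of estimator(sample)."""
    mean = sum(pr * estimator(rs) for rs, pr in probs.items())
    second = sum(pr * estimator(rs) ** 2 for rs, pr in probs.items())
    return mean, second - mean * mean

probs = sample_probabilities([0.3, 0.5, 0.7])
# Placeholder estimator, hypothetical: pooled proportion responding.
mean, var = moments(lambda rs: sum(rs) / 30.0, probs)
```

Replacing the placeholder by a function returning α̂ or β̂ for each sample gives the sums of estimates and sums of squares from which bias, variance and mean square error follow.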

For the dosage arrangement with probability at the central dosage P₂ = 0.5, the dosage values
xᵢ at the successive dosages were taken as −1, 0, +1, and the values of α̂ and β̂, the estimates
of α and β, respectively, which were punched into the master cards, were entered as for this
dosage arrangement. With other equally spaced dosage arrangements, for any specific
sample, β̂, the estimate of β, will be the same, but the estimate of α will be different and is


given by

$$\hat{\alpha} = \hat{\alpha}_{0.5} - C\hat{\beta}, \tag{16}$$

where $\hat{\alpha}_{0.5}$ is the estimate for that sample with P₂ = 0.5, and C = (1/β) × normit of P₂.
Accordingly the estimates did not have to be computed anew for each new experimental
set-up, since all the experiments dealt with were for three dosages, equally spaced. It was
necessary, however, to calculate anew the probability of each sample, from the probabilities
corresponding to the new dosage positions.* These were punched into the main cards, the
values $\hat{\alpha}_{0.5}$, $\hat{\alpha}_{0.5}^2$, $\hat{\beta}$, $\hat{\beta}^2$, $\hat{\alpha}_{0.5}\hat{\beta}$ having already been punched into them. The five multiplications
of the sample probability by these quantities were accomplished, and the results were
punched into the cards and checked as for the calculations with P₂ = 0.5. By totalling these,
the required sums of estimates, the sums of squares and the sums of products were obtained,
and from them the required statistics were computed for the new dosage arrangement, taking
into account relation (16).
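
How the five totals suffice may be seen from a short sketch, given here in Python as an illustration rather than as the computation actually performed. It assumes only that the estimate for the new arrangement is a linear combination of the form α̂ − Cβ̂, so that its mean and variance follow from the expectations of α̂, β̂, α̂², β̂² and α̂β̂; the toy population and the value of `c` are invented for the example.

```python
def shifted_alpha_moments(m_a, m_b, m_aa, m_bb, m_ab, c):
    """Mean and variance of alpha_hat - c*beta_hat from the five expectations
    E[a], E[b], E[a^2], E[b^2] and E[a*b]."""
    mean = m_a - c * m_b
    second = m_aa - 2.0 * c * m_ab + c * c * m_bb
    return mean, second - mean * mean

# Toy population of (alpha_hat, beta_hat, probability) triples, invented for illustration.
toy = [(0.10, 0.50, 0.2), (-0.20, 0.60, 0.5), (0.30, 0.40, 0.3)]
sums = [sum(pr * f(a, b) for a, b, pr in toy)
        for f in (lambda a, b: a, lambda a, b: b, lambda a, b: a * a,
                  lambda a, b: b * b, lambda a, b: a * b)]
mean, var = shifted_alpha_moments(*sums, c=1.0)
```

The product term is what makes the fifth multiplication necessary: without the total of α̂β̂ the variance of the shifted estimate could not be recovered from the cards.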


2. Results 


The primary purpose of these experiments was to compare the error of the minimum
normit χ² estimate with the error of the maximum likelihood estimate. An indefinite
number of criteria for such a comparison are conceivable; I used the classic measure of error, 
the mean square error, that is, the expected value of the square of the difference of the 
estimate from the true value of the parameter. 

This cannot be the place for an adequate discussion of such a fundamental idea as the 
concept of error. However, I should like to note that at the beginning of these investigations, 


* It is of some interest to note that this is not the case for the equivalent situation with the logistic
function. That function has simple sufficient statistics which for α and β are respectively Σrᵢ and
Σrᵢxᵢ. Accordingly the relative frequency of samples, within sets of samples having the same value for
these statistics, remains the same, independent of the change of parameters. The total probability of
such a set, with a change of parameter, therefore can be obtained from the probability with the original
value of the parameter, by multiplication with a determinable constant. A change of dosage arrangement
is equivalent to a change of the parameter α, and for the experiments with α to be estimated,
β being known, since there are only 31 sets of samples defined by Σrᵢ, the required probabilities with
change of dosage arrangement can be determined by 31 simple multiplications instead of 1331 more
involved multiplications. This facilitation illustrates the great general statistical advantage of
working with a function that possesses simple sufficient statistics for estimating its parameters.
I did consider and make calculations on the basis of other criteria, as well as with the mean
square error. In particular I explored the idea of comparing the discrepancy of the estimates
from their parameters directly in terms of the distribution of the estimates. This is related to
what Savage (1954) calls the ‘principle of inadmissibility’, and to what Pitman (1937) has
investigated as the ‘closeness’ of an estimate. According to this principle, if T is an estimate
of θ and e is the error (T − θ) of estimate, then if the probability of an error equal to or greater
than |e| is smaller for an estimator T₁ than for T₂ (excluding the values of e for which the
probabilities are equal) for all values of e and for all values of θ, then T₁ is a better estimator than
T₂ in an absolute sense. Intuitively this principle seems incapable of denial, and I am not
disposed to question it here. However, I do not believe that it can be very useful in providing
a criterion for comparison of estimates in situations which are met in practice. For it will
only rarely occur that two seriously recommended estimates are comparable in this respect
at all; it will not even be typical for one to be better than the other in this stringent sense at
any given θ, let alone one being better than the other in this sense at all θ.

If for all values of θ, T₁ and T₂ are normally distributed and are both unbiased, and the
variance of T₁ is less than the variance of T₂, then T₁ is better than T₂ in the absolute sense
defined. So, for instance, the estimate of the mean of a normal distribution, from the mean
of a sample based on n₁ > n₂, is better, in the absolute sense, than an estimate based on
a sample of n₂. But this is a case with the estimate represented by a continuous variate. If
we consider a discontinuous variate, for instance the estimate of a probability P, with the
usual estimate T = s/n, where s is the number of observed events in a sample of n, the case is
already not so simple. If for T₁, n = 3, and for T₂, n = 2, then both estimators are unbiased,
and the variance of T₁ is less than that of T₂. But T₁ is not necessarily better than T₂ in the
absolute sense defined, as can be seen if we consider the case with P = 0.5. In this case the
probability of T₁ yielding the correct value of the parameter is zero, while for T₂ it is 0.5, so
that, for instance, the probability of an error equal to or greater than |0.1| is 1 for T₁ and
0.5 for T₂. In comparing the minimum normit χ² estimate with the maximum likelihood
estimate in terms of the distribution of the estimates, I encountered just such ‘paradoxes’.
Although, philosophically, the mean square error cannot be considered necessarily the only
relevant measure of error, I hardly believe that a better criterion for general application will
be found. This is to say, if for some practical situation, an estimator T₁ is found to have
a smaller mean square error than T₂, for all reasonable values of the parameter, I doubt that
T₂ will be found better than T₁ by some other reasonable criterion.
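
The example with n = 3 and n = 2 can be checked directly. The sketch below, in Python with illustrative function names, computes for T = s/n the variance and the probability of an error equal to or greater than a given amount, under the binomial distribution of s.

```python
import math

def error_tail(n, p, e):
    """Pr(|s/n - p| >= e) when s is binomial with index n and probability p."""
    return sum(math.comb(n, s) * p**s * (1 - p)**(n - s)
               for s in range(n + 1) if abs(s / n - p) >= e)

def variance(n, p):
    """Variance of the estimate s/n."""
    return p * (1 - p) / n

tail_t1 = error_tail(3, 0.5, 0.1)   # T1, based on n = 3
tail_t2 = error_tail(2, 0.5, 0.1)   # T2, based on n = 2
```

With P = 0.5 the estimator with the smaller variance (n = 3) can never equal the parameter, so its probability of an error of 0.1 or more is the larger of the two.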

Both estimates are asymptotically efficient (Taylor, 1953a). This is interpreted to mean
that with xᵢ's fixed, as the number nᵢ at each xᵢ is increased indefinitely, the estimate
approaches a normal distribution the variance of which is equal to or less than that of any
other similarly distributed estimate. It is to be noted that the properties refer to the
never-attained limiting distribution and that mathematically asymptotic efficiency does not
imply necessarily any attribute of the estimate with finite samples. However, it is reasonable
to assume, for practical cases like the present one, that for large nᵢ at each xᵢ, both estimates
will have about the same variance. In one experiment with three equally spaced doses,
corresponding respectively to true P's 0.3, 0.5, 0.7, 50 at each dose, using an amply large
stratified random sample, this was corroborated. With β = 0.524401, considered known and
α = 0 to be estimated, the theoretical value of the asymptotic variance correct to six decimal
places was 0.011186, the determined value of the variance for the minimum normit χ²
estimate was 0.011186, that is the same, to that degree of precision, while that for the
maximum likelihood estimate was only slightly higher, namely 0.011191 (Berkson, 1955a).
For both estimators, to four significant figures and five decimal places the variance had
attained its asymptotic value.
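
The theoretical figure can be reproduced with a short calculation. The sketch below is in Python and assumes that the asymptotic variance of α̂ with β known is the reciprocal of the information Σ nᵢZᵢ²/(PᵢQᵢ), the same sum that appears in the large-sample variance formula of the paper; the function name is illustrative.

```python
from statistics import NormalDist

def information_alpha(true_p, n):
    """Sum over the doses of n * Z^2 / (P * Q), with Z the normal ordinate at the normit of P."""
    nd = NormalDist()
    total = 0.0
    for p in true_p:
        z = nd.pdf(nd.inv_cdf(p))
        total += n * z * z / (p * (1.0 - p))
    return total

# Three doses with true P = 0.3, 0.5, 0.7 and 50 animals at each.
asymptotic_variance = 1.0 / information_alpha([0.3, 0.5, 0.7], n=50)
```

With 10 at each dose instead of 50 the same function should give about 0.0559, the value of 1/I entered for the symmetrical arrangement in Table 4.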

With small samples, comparison of the error of the estimates is complicated by the fact
that for some samples the maximum likelihood estimate is infinite. In each such case the
sample was omitted in computing the statistics of the minimum normit χ² estimate as well as
of the maximum likelihood estimate. With probability at central dosage P₂ = 0.5, either in
the case of estimating only α or in that of estimating both parameters, the frequency of such
samples in the sampling population is so very small that their omission can reasonably be
considered not to affect the virtual comparison of the estimates. But as the dosage arrangement
is changed from the symmetrical one with P₂ = 0.5, to asymmetrical arrangements with
P₂ appreciably different from 0.5, the probability of such samples increases, and the calculation
of the statistics omitting them becomes questionable. I have made comparisons only
for situations in which the samples yielding an infinite estimate by maximum likelihood
constitute less than 5% of the sampling distribution. When both parameters are to be
estimated, with P₂ = 0.9, 25.1% of the samples yield infinite estimates by maximum likelihood.
The upper limit of the experimental arrangements was therefore taken as the one
corresponding to P₂ = 0.8, for which the probability of samples with an infinite maximum
likelihood estimate is 4.4%. In order to co-ordinate the computation, the experiments
estimating only α were limited to this region also. Only experiments with P₂ ≥ 0.5 are
described, since the relation between the estimators is the same for any P′₂ < 0.5 as for the
one with P₂ = 1 − P′₂.

The optimum arrangement for a bio-assay experiment is with dosages x symmetrically
disposed around the value for which P = 0.5, for it is with this arrangement that the variances
of the estimates are smallest. Of course, without knowing the values of the parameters
exactly an experiment cannot be accomplished with just this disposition of dosages, but by
preliminary experimentation the position of x for which P = 0.5 can be located approximately,
and a well-designed experiment should not be done with central x very different
from this value. In a sense, therefore, the comparison of the estimates with each other when
P₂ = 0.5 is the most important one, since it compares the estimates in the ideal experiment
from which good experiments will not be far removed. With this experimental arrangement,
the estimate of α is unbiased, both for the minimum normit χ² and for the maximum likelihood
estimator, whether it is estimated alone or simultaneously with β. For β, however, even
in this arrangement, both estimators are biased, and for all other arrangements the estimate
of each parameter is biased for both estimators. Unbiased estimation, therefore, is seen here
to be exceptional, limited to α, and even in that case only with the model experiment.*

The results are shown in detail in Table 4 and Table 5. For the case of estimating α with β
known (Table 4), the bias is positive for the maximum likelihood estimate, and negative for
the minimum normit χ² estimate, the absolute value of the latter being the larger of the two.
The variance is in all cases smaller for the minimum normit χ² estimate, and sufficiently
smaller to outbalance the larger bias, the net result being a smaller mean square error for the
minimum normit χ² estimate in all the experiments. In the case of α and β to be estimated
simultaneously (Table 5), the bias is negative for α and positive for β, except for the minimum

* I do not mean to imply that unbiasedness is necessarily a desideratum. It is not obvious that, say,
an unbiased estimator T₁, whose variance is larger than the mean square error of a biased estimator T₂,
is preferable to it. I should prefer T₂.
normit χ² estimate, which at P₂ = 0.8 is positively biased for α and negatively for β. The
variance is smaller for the minimum normit χ² estimate, of either parameter, alone or both
simultaneously, and with all dosage arrangements, and so also is the mean square error. It
may be noted that whereas in the case of α alone to be estimated, the minimum normit χ²
estimate achieves a smaller mean square error in spite of its bias having a larger absolute
value than the maximum likelihood estimate, in the case of both α and β to be estimated, for


Table 4. Estimate of α with β known

Based on the total sampling population. The samples with infinite estimate by maximum likelihood
are omitted in calculating all statistics. The greatest frequency of such samples is 0.04%, occurring in
the experiment with P₂ = 0.8.

   True P at dose     |        Bias         |      Variance       |  Mean square error  |   1/I   | Mean square error ÷ 1/I
  Low    Mid.   High  |   Max.      Min.    |   Max.      Min.    |   Max.      Min.    |         |   Max.      Min.
                      |  likeli-   normit   |  likeli-   normit   |  likeli-   normit   |         |  likeli-   normit
                      |   hood       χ²     |   hood       χ²     |   hood       χ²     |         |   hood       χ²
 ---------------------+---------------------+---------------------+---------------------+---------+---------------------
  0.3    0.5    0.7   |  0          0       |  0.0589    0.0537   |  0.0589    0.0537   | 0.0559  |  1.054     0.961
  0.393  0.6    0.782 |  0.0068   −0.0073   |  0.0606    0.0546   |  0.0606    0.0547   | 0.0571  |  1.061     0.958
  0.5    0.7    0.853 |  0.0152   −0.0177   |  0.0663    0.0572   |  0.0665    0.0575   | 0.0612  |  1.087     0.940
  0.624  0.8    0.914 |  0.0287   −0.0401   |  0.0802    0.0598   |  0.0810    0.0614   | 0.0706  |  1.147     0.870

Table 5. Estimate of α and β simultaneously

Based on the total sampling population. The samples with infinite estimate by maximum likelihood
are omitted in calculating all statistics, the frequency of such samples being shown in the last column.

   True P at dose     |        Bias         |      Variance       |  Mean square error  | % samples with
  Low    Mid.   High  |   Max.      Min.    |   Max.      Min.    |   Max.      Min.    | infinite M.L.
                      |  likeli-   normit   |  likeli-   normit   |  likeli-   normit   | estimate
                      |   hood       χ²     |   hood       χ²     |   hood       χ²     |
 ---------------------+---------------------+---------------------+---------------------+---------------
                                             Estimate of α
  0.3    0.5    0.7   |  0          0       |  0.0675    0.0588   |  0.0675    0.0588   |  0.09
  0.393  0.6    0.782 | −0.0027   −0.0079   |  0.0880    0.0791   |  0.0880    0.0792   |  0.11
  0.5    0.7    0.853 | −0.0135   −0.0041   |  0.1587    0.1485   |  0.1589    0.1485   |  0.60
  0.624  0.8    0.914 | −0.0278    0.0517   |  0.3698    0.2953   |  0.3706    0.2980   |  4.43
                                             Estimate of β
  0.3    0.5    0.7   |  0.0467    0.0288   |  0.1109    0.0947   |  0.1131    0.0955   |  0.09
  0.393  0.6    0.782 |  0.0505    0.0234   |  0.1172    0.0936   |  0.1198    0.0941   |  0.11
  0.5    0.7    0.853 |  0.0588    0.0018   |  0.1339    0.0898   |  0.1374    0.0898   |  0.60
  0.624  0.8    0.914 |  0.0526   −0.0564   |  0.1494    0.0824   |  0.1522    0.0856   |  4.43

a number of the dosage arrangements the absolute value of the bias as well as the variance
is smaller for the minimum normit χ² estimate.


3. Lower bound of mean square error 
In the case of one parameter to be estimated, with certain regularity conditions obtaining
which are well fulfilled in the present problem, the lower bound for the mean square error of
an estimate T of a parameter θ is given by*

$$E(T-\theta)^2 \ge \frac{(1 + \partial b/\partial\theta)^2}{I} + b^2, \tag{17}$$

and the lower bound of the variance is the first term of the right-hand side of (17), where
b = E(T − θ) is the bias of the estimate T, and I is Fisher's amount of extractable information,
the value of which is

$$I = E\left(\frac{\partial \ln \phi}{\partial \theta}\right)^2, \tag{18}$$

where φ is the probability of the sample. The parameter to be estimated being α, the value of
I is

$$I(\alpha) = \sum n_i Z_i^2/(P_i Q_i). \tag{19}$$


For the evaluation of the lower bound (17), I(α) can be calculated directly, but ∂b/∂α is
also required. This was evaluated by plotting the bias b against α and estimating the slope
of the bias function graphically, at required values of α.† The bias function for both
estimators is shown in Fig. 2.
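
The arithmetic of the bound is simple once the slope of the bias function is in hand. The sketch below, in Python, evaluates the right-hand side of (17) from the information, the bias and the slope. The value 0.0559 for 1/I is that of Table 4 for the symmetrical arrangement; the slopes of ±0.02 are hypothetical figures chosen only to show the effect of the sign, not values read from the paper's graph.

```python
def mse_lower_bound(information, bias, bias_slope):
    """Right-hand side of (17): (1 + db/d(theta))^2 / I + b^2."""
    variance_bound = (1.0 + bias_slope) ** 2 / information
    return variance_bound + bias ** 2

info = 1.0 / 0.0559   # information for alpha, from 1/I in Table 4 (P of central dose 0.5)

# Hypothetical slopes of the bias function, of opposite sign.
lb_rising = mse_lower_bound(info, bias=0.0, bias_slope=+0.02)
lb_falling = mse_lower_bound(info, bias=0.0, bias_slope=-0.02)
```

A bias function with negative slope brings the bound below 1/I, and one with positive slope raises it above 1/I.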
Fig. 2. Bias of α̂ in relation to α, where α = normit of P₂; estimate of α with β known. Curves
are shown for the maximum likelihood and the minimum normit χ² estimates. The evaluation of
∂b/∂α required for calculation of the lower bound (17) was obtained graphically from an
enlarged replica of this figure.


* This inequality is frequently referred to as the ‘Cramér-Rao lower bound’. It was, however, 
developed by Fréchet, and in a very clear way, earlier than by these authors (Savage, 1954). 

† Change of the dosage arrangement from one with P₂ = 0.5 to some other value of P₂ is equivalent to
retaining the original dosage disposition and changing α from α = 0 to α = normit of P₂.
The lower bounds of the maximum likelihood estimator and the minimum normit χ²
estimator are given in Table 6, and shown in Fig. 3 together with the corresponding mean
square errors. It is seen that neither estimate attains the lower bound for its mean square
error, but that the lower bound of the minimum normit χ² estimator is lower than that of
the maximum likelihood estimator. The attainment of a smaller lower bound by the minimum
normit χ² estimator is directly related to the character of its bias function, the first
derivative of which is everywhere negative, while that of the maximum likelihood estimator
is everywhere positive (Fig. 2).

Fig. 3. Mean square error and lower bound of the mean square error; estimate of α with β known.
Abscissa: P of central dose, with a second scale α = normit of P₂. Curves: mean square error and
lower bound, for maximum likelihood and for minimum normit χ².

Table 6. Lower bound of the mean square error; estimate of α with β known

   True P at dose     | Maximum likelihood  | Minimum normit χ²
  Low    Mid.   High  |   L.B.     M.S.E.   |   L.B.     M.S.E.
 ---------------------+---------------------+---------------------
  0.3    0.5    0.7   |  0.0585    0.0589   |  0.0531    0.0537
  0.393  0.6    0.782 |  0.0603    0.0606   |  0.0536    0.0547
  0.5    0.7    0.853 |  0.0656    0.0665   |  0.0555    0.0575
  0.624  0.8    0.914 |  0.0784    0.0810   |  0.0587    0.0614

We may also note the relation between the variance and 1/I, which is sometimes treated
as the variance equivalent of the amount of extractable information and is in fact the lower
bound of the variance for an unbiased estimator. At all the dosage arrangements of the
experiments, the variance of the minimum normit χ² estimate is smaller and that of the
maximum likelihood estimator is larger than 1/I (Table 4).
4. Distribution of χ²


Related to estimation of the parameters in the situation of bio-assay is the testing of
goodness of fit by the chi-square test. The standard practice is to calculate the classic χ²
of Karl Pearson

$$\chi^2 = \sum \frac{(o_i - e_i)^2}{e_i}, \tag{20}$$

where, in the present situation, oᵢ is the observed number of responses at xᵢ, and eᵢ is the
‘expected’ number, computed as eᵢ = P̂ᵢnᵢ, where P̂ᵢ is the probability at xᵢ calculated from
(1) using the estimated values of the parameters to replace the true values. The test of
significance is obtained by reference to a table of the distribution of chi-square* for degrees
of freedom equal to the number of dosages xᵢ less the number of parameters which have been
estimated.
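
A small sketch in Python may make the calculation concrete. It is offered under one explicit assumption: the sum in (20) is taken over both the responding and the non-responding class at each dosage, as is usual for the Pearson χ² with quantal data. The function names are illustrative.

```python
def pearson_chi_square(r, n, p_fitted):
    """Sum of (o - e)^2 / e over the responding and non-responding cells at each dosage."""
    total = 0.0
    for ri, ni, pi in zip(r, n, p_fitted):
        e_resp = ni * pi            # expected number responding
        e_non = ni * (1.0 - pi)     # expected number not responding
        total += (ri - e_resp) ** 2 / e_resp + ((ni - ri) - e_non) ** 2 / e_non
    return total

def degrees_of_freedom(n_doses, n_params_estimated):
    """Number of dosages less the number of parameters estimated."""
    return n_doses - n_params_estimated

chi2 = pearson_chi_square([4, 5, 7], [10, 10, 10], [0.3, 0.5, 0.7])
df = degrees_of_freedom(3, 1)
```

With three dosages and α alone estimated the reference distribution has 2 degrees of freedom, as in the experiments described below.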

It is well known that the χ² of (20) is only asymptotically distributed as chi-square, and
then only if the estimates of the parameters are asymptotically efficient estimates. For finite
samples it is so distributed only approximately. Now the classic χ² of Pearson is not the only
function of the observations which is asymptotically distributed as chi-square. The normit
χ² of (4),

$$\chi^2(\text{normit}) = \sum n_i w_i (X_i - \hat{X}_i)^2,$$

which is the quantity minimized to obtain the minimum normit χ² estimate, is also asymptotically
distributed as chi-square. If the parameters have been estimated by minimum
normit χ², the normit χ² is somewhat easier to compute than the Pearson χ², since it can be
calculated directly using the estimated parameters by the familiar formula

$$\chi^2 = \sum n_i w_i (X_i - \bar{X})^2 - \hat{\beta} \sum n_i w_i (X_i - \bar{X})(x_i - \bar{x}). \tag{21}$$



When the parameters have been estimated in this way, therefore, it seems natural to
calculate the normit χ² for the chi-square test, and the question arises as to whether it is
distributed as closely to the asymptotic chi-square distribution as is the Pearson χ². Then
again, since the minimum normit χ² estimate and the maximum likelihood estimate are both
asymptotically efficient, the question arises as to whether perhaps one rather than the other,
used to calculate the expected numbers eᵢ in the Pearson χ² of (20), gives a better approximation
of the theoretical chi-square distribution.
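
The normit χ² and the estimates that minimize it can be sketched as a weighted least-squares fit. The Python below is an illustration under assumptions that should be stated plainly: the weight is taken as w = z²/(pq) evaluated at the observed relative frequency p, the observed frequencies are assumed to lie strictly between 0 and 1 (no working rule for 0% or 100% response is included), and both α and β are estimated. The minimized value is computed from the residual form of the familiar regression identity.

```python
from statistics import NormalDist

def min_normit_chi_square(x, n, r):
    """Weighted least-squares fit of the observed normits X_i on x_i.

    Returns (alpha_hat, beta_hat, normit_chi_square).
    Observed frequencies r/n of exactly 0 or 1 are not handled here.
    """
    nd = NormalDist()
    sw = swx = swy = swxx = swxy = swyy = 0.0
    for xi, ni, ri in zip(x, n, r):
        p = ri / ni
        big_x = nd.inv_cdf(p)               # observed normit X_i
        z = nd.pdf(big_x)                   # ordinate at the observed normit
        w = ni * z * z / (p * (1.0 - p))    # n_i * w_i (assumed weight)
        sw += w
        swx += w * xi
        swy += w * big_x
        swxx += w * xi * xi
        swxy += w * xi * big_x
        swyy += w * big_x * big_x
    x_bar = swx / sw
    y_bar = swy / sw
    sxx = swxx - sw * x_bar * x_bar
    sxy = swxy - sw * x_bar * y_bar
    syy = swyy - sw * y_bar * y_bar
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    chi2 = syy - beta * sxy                 # residual weighted sum of squares
    return alpha, beta, chi2

alpha_hat, beta_hat, chi2 = min_normit_chi_square([-1, 0, 1], [10, 10, 10], [2, 5, 7])
```

No iteration is needed, which is the computational advantage referred to in the text.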

In order to shed some light on these questions, the distributions of three sample χ² functions
were calculated for the experiment with P₂ = 0.5 in the case with α alone to be estimated: the
normit χ² for the minimum normit χ² estimates, the Pearson χ² for the minimum normit χ²
estimates, and the Pearson χ² for the maximum likelihood estimates.

For each sample, the χ² was computed, using the estimates with four decimal places and
writing the calculated χ² to three decimal places. The result was punched into a card
together with the sample designation and its probability. The distribution of the χ² was then
easily obtained by cumulating the probabilities progressively on the tabulator, with the
samples ordered on the value of the χ². The cumulative distributions for the sample χ²
functions are shown in Fig. 4 together with the asymptotic chi-square distribution, and
some selected values are listed numerically in Table 7.
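
The cumulation can be sketched as follows in Python; the names are illustrative. For 2 degrees of freedom the chi-square tail has the closed form Pr(χ² > χ₀²) = exp(−χ₀²/2), so the point χ₀² for a stated level is −2 ln(level); the exact frequency for a sample χ² function is the total probability of the samples whose χ² exceeds that point.

```python
import math

def chi2_2df_critical(level):
    """Point x0 with Pr(chi-square > x0) = level, for 2 degrees of freedom."""
    return -2.0 * math.log(level)

def tail_frequency(chi_values, probs, x0):
    """Exact Pr(sample chi-square > x0): cumulate the probabilities of the samples beyond x0."""
    return sum(pr for c, pr in zip(chi_values, probs) if c > x0)

x0_25 = chi2_2df_critical(0.25)
x0_01 = chi2_2df_critical(0.01)

# Toy illustration with three invented samples, not data from the experiment.
toy_freq = tail_frequency([1.0, 3.0, 6.0], [0.5, 0.3, 0.2], 2.0)
```

Supplying the χ² and the probability of each of the 1331 samples in place of the toy values gives the kind of percentage frequencies tabulated below.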


* By the distribution of chi-square we mean the exact distribution of the sum of squares of a 
number of independent normal deviates having unit standard deviations. 
Table 7. Distribution of sample χ² functions for experiment with P₂ = 0.5, α alone to be
estimated; comparison of the percentage frequencies with the theoretical frequencies

The entries give in % for the respective χ²'s, Pr(χ² > χ₀²), where in the theoretical asymptotic
chi-square distribution for 2 degrees of freedom, Pr(χ² > χ₀²) = L%.

                      | Max. likelihood |   Min. normit χ² estimate
  Level L             |    estimate     |
    (%)        χ₀²    |   Pearson χ²    |  Pearson χ²     Normit χ²
 ---------------------+-----------------+---------------------------
    25        2.773   |     26.75       |    26.66          24.02
    20        3.219   |     22.69       |    22.22          20.96
    15        3.794   |     15.65       |    15.49          11.76
    10        4.605   |      9.65       |     9.64           8.15
     5        5.992   |      5.14       |     5.12           2.70
     1        9.210   |      0.79       |     0.81           0.39
     0.5     10.597   |      0.46       |     0.46           0.14

Inspection of the normit χ² distribution in Fig. 4 and Table 7 reveals rather wide discrepancy
of the distribution frequencies from those of the asymptotic chi-square distribution.
The frequencies for specific points of χ² are sometimes higher, sometimes lower than the
asymptotic frequency, until about the 15% level. From that point forward the normit χ²
frequency is consistently too low; at the 5% level the frequency is 2.7%, and at the 1% level the
frequency is 0.4%. If, therefore, the normit χ² were used for a test at either of these levels,
the tested hypothesis would be ‘rejected’ considerably less frequently than required by
theory. The Pearson χ², for the same minimum normit χ² estimates, is seen to be in much
better agreement with the asymptotic chi-square distribution in the region below the 15%
level. For the 5% level the frequency is 5.1% and for the 1% level it is 0.8%. The normit χ²
should therefore not be used in preference to the Pearson χ², for a χ² test of significance, even
though it is somewhat easier to calculate. This conclusion is in line with that of David (1950),
who condemned the use of the Neyman reduced χ² for a chi-square test of significance, even
when there was good reason for preferring an estimate obtained by minimizing that χ².

Comparison of the distribution of the Pearson χ² in Fig. 4 and Table 7, for the minimum
normit χ² estimate and the maximum likelihood estimate, shows them to be closely similar
in their approximation of the asymptotic chi-square distribution. At the 5% and the 1%
levels the approximation is somewhat closer for the minimum normit χ² estimate than for
the maximum likelihood estimate. Thus, the advantage of the minimum normit χ² estimate
over the maximum likelihood estimate found in respect of the variance and mean square
error is supported, or at least not offset, by the approximation of its χ² distribution to the
asymptotic distribution.

5. Conclusion 


It has been mathematically demonstrated (Taylor, 1953a) that the minimum normit χ²
estimate falls in the class R.B.A.N. (regular best asymptotically normal) of Neyman. The 
estimate is therefore asymptotically efficient, and in ‘large sample theory’ it is equivalent to 
the maximum likelihood estimate. In large samples, as this phrase commonly is used in 
Fig. 4. Distribution of sample χ² functions; estimate of α with β known; the theoretical
asymptotic distribution is the chi-square distribution for 2 degrees of freedom. Abscissa: χ₀².
One of the curves is labelled ‘Normit χ² with minimum normit χ² estimate’.


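statistical literature, both estimates are normally distributed, with mean equal to the
parameter estimated and variance given by

$$\sigma^2(\hat{\alpha}) = 1\Big/\sum \big(n_i Z_i^2/(P_i Q_i)\big), \tag{22}$$

$$\sigma^2(\hat{\beta}) = 1\Big/\sum \big(n_i Z_i^2/(P_i Q_i)\big)(x_i - \bar{x})^2, \tag{23}$$

where

$$\bar{x} = \sum \big(n_i Z_i^2/(P_i Q_i)\big)\,x_i \Big/ \sum \big(n_i Z_i^2/(P_i Q_i)\big). \tag{24}$$
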
For small samples, we have found in the experiments presented that the mean square
error of the minimum normit χ² estimate is smaller than that of the maximum likelihood
estimate. Since this conclusion is based on numerical findings in specific situations and not 
on a mathematical demonstration, the question may be asked whether it can be considered 
proved, and how general it may be regarded as being. Of course no specific numerical results 
can be considered to be a proof. Even so far as concerns the particular dosage arrangements 
for which the calculations have been made, there is the possibility that some arithmetical 
errors have been committed. All that can be said on this point is that it is unlikely in the 
extreme that arithmetical errors have been made that are of so serious a nature that their 
correction would reverse this conclusion. Care has been taken, so far as practicable, to check 
each arithmetical operation involved in the calculation, and in addition over-all indirect 
checks have been applied whenever the possibility presented itself. The ratios of and dif- 
ferences between the mean square errors of the two estimates, as they have emerged from 
these calculations, change in a regular way with change in the assumed values of the 
parameters. The conclusions are essentially the same as those previously reached indepen- 
dently with the use of carefully planned samples. The results are in agreement, in general 
and in detail, with those obtained in investigations of the logistic function, which is very 
similar in curvature characteristics to the integrated normal function. So far, therefore, as 
the specific experimental conditions in which these investigations have been made are 
concerned, there seems no reasonable doubt but that the mean square error of the maximum 
likelihood estimate is larger than that of the minimum normit χ² estimate.


Table 8. Estimate of α and β simultaneously, with dosages spread

P₂ = 0.5, the lower dosage is at P₁ = 0.1, the upper is at P₃ = 0.9 instead of at 0.3 and 0.7 as in the
standard experiments. The samples with infinite estimate by maximum likelihood constitute 12.2% of
the sampling population and are omitted in calculating all statistics. The results are based on the total
sampling population.

                     |             α̂              |             β̂
  Statistic          |    Max.          Min.      |    Max.          Min.
                     | likelihood    normit χ²    | likelihood    normit χ²
 --------------------+----------------------------+----------------------------
  Bias               |   0             0          |   0.0507       −0.0983
  Variance           |   0.1052        0.0562     |   0.1246        0.0566
  Mean square error  |   0.1052        0.0562     |   0.1272        0.0663

A further question that may be asked is whether the same conclusion applies to experiments
with a different arrangement of dosages. It was early suggested in connexion with
the logistic function that the relation between the mean square errors of the two estimates
might be reversed, if the dosages were arranged with a wider ‘spread’. For this reason
a calculation was made, using the total population of samples, for an experiment with
P₂ = 0.5, but with the lower dosage at P₁ = 0.1 and the upper dosage at P₃ = 0.9, instead of
at 0.3 and 0.7, respectively. The results (Table 8) not only confirmed the larger mean square
error of the maximum likelihood estimate, but showed it to be in larger ratio to that of the
minimum normit χ² than with the more narrowly disposed dosages, and moreover pointed
out more emphatically the difficulty encountered in the maximum likelihood estimator’s
yielding infinite estimates.

Similarly, it was suggested that a different sort of estimate, such as a bio-assay estimate
of relative potency, might show the smaller error to be with the maximum likelihood
estimator. A calculation was therefore made for an experiment simulating a four-point
parallel assay. In such an experiment a mixture of unknown strength is to be assayed as to
its relative potency with respect to another mixture considered as standard. For the
standard mixture, two concentrations x_1 and x_2, measured in logarithmic units, are used, and
n_i animals are exposed at each; for the unknown, the same two concentrations x_1 and x_2 are
used, and n_i exposed at each. Thus there are four relative frequencies observed, two
corresponding to the standard mixture, two to the unknown. Two normit lines (α + βx_i) are
fitted, one to the points of the standard mixture, one to the points of the unknown, these
lines constrained to be parallel, so that β is the same for both. The distance between the
fitted parallel lines, in units of x, is the difference of the logarithms of the dosages of the
unknown and standard, respectively, which produce the same response. This is taken as the
measure of the logarithm of the relative potency, and is given by M = (α̂_u − α̂_s)/β̂, where α̂_u
is the estimate of α corresponding to the normit line of the unknown, and α̂_s is the similar
estimate corresponding to the line of the standard mixture (see Finney, 1947). The estimate,
based on 100 stratified random samples, showed the maximum likelihood estimator to have
the larger mean square error (Table 9).
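As a rough illustration of the four-point computation just described, the sketch below fits two parallel normit lines and returns M = (α̂_u − α̂_s)/β̂. It uses ordinary (unweighted) least squares purely to keep the example short; that is an assumption of this sketch, and it is neither the maximum likelihood nor the minimum normit χ² estimator compared in the text. The function name `log_relative_potency` and its argument layout are hypothetical.

```python
from statistics import NormalDist


def normit(p):
    """Normal deviate whose lower-tail probability is p (requires 0 < p < 1)."""
    return NormalDist().inv_cdf(p)


def log_relative_potency(x1, x2, p_std, p_unk):
    """Unweighted parallel-line estimate of M for a four-point assay.

    x1, x2 : the two log doses, shared by standard and unknown
    p_std  : observed response proportions of the standard at (x1, x2)
    p_unk  : observed response proportions of the unknown at (x1, x2)
    """
    ys = [normit(p) for p in p_std]
    yu = [normit(p) for p in p_unk]
    # Common slope: with two doses per preparation, least squares reduces
    # to the average of the two within-preparation slopes.
    beta = ((ys[1] - ys[0]) + (yu[1] - yu[0])) / (2.0 * (x2 - x1))
    x_bar = (x1 + x2) / 2.0
    alpha_s = (ys[0] + ys[1]) / 2.0 - beta * x_bar
    alpha_u = (yu[0] + yu[1]) / 2.0 - beta * x_bar
    return (alpha_u - alpha_s) / beta


# Equal observed responses for both preparations give M = 0.
print(log_relative_potency(-1.0, 1.0, (0.3, 0.7), (0.3, 0.7)))
```

An observed proportion of exactly 0 or 1 has no finite normit, so `inv_cdf` raises an error for such samples; this mirrors the difficulty with infinite estimates discussed above.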


Table 9. Estimate of log relative potency M, with a four-point parallel assay

Simulated experiment with 'standard' and 'unknown' taken as of equal potency, M = 0. For both,
lower dosage x_1 at P_1 = 0.3, upper dosage x_2 at P_2 = 0.7. n_i = 10 for each of the four
observations. Based on 100 stratified random samples of the four-point assay.

Estimator          | Bias*  | Variance | M.S.E.
Maximum likelihood | -0.014 | 0.185    | 0.185
Minimum normit χ²  | -0.013 | 0.175    | 0.176

* The expected value is actually zero; the empirical figures are to be attributed to sampling
fluctuation.


Lastly, it was suggested that the relative results would be reversed if the number of doses 
were greatly increased. In this connexion specific results can be quoted only for experiments 
with the logistic function. Since in other experiments performed with both functions the 
results are similar, these findings with the logistic function are relevant. Experiments 
comparing the maximum likelihood estimator and the minimum logit χ² estimator, with
dosages up to 11 (Berkson, 1955b), and experiments comparing the maximum likelihood
estimator with the minimum Pearson χ² estimator with dosages up to 100 (Taylor, 1953b),
showed the maximum likelihood estimator to have the larger mean square error. 

In spite of all the corroboratory findings, in a variety of experimental conditions, of the 
larger mean square error of the maximum likelihood estimator, the possibility is not excluded 
that some conditions may be definable in which the minimum normit χ² estimator has the
larger error. It seems unlikely, however, that if such conditions are discovered, they will
correspond to any met in practice in a well-designed bio-assay or in similar assay. 

Even with all this, the important finding of the present investigation is not that the
minimum normit χ² estimator has a smaller error than the maximum likelihood estimator,
even if this is accepted in full generality, but rather that the maximum likelihood estimator
does not have the smaller error in all circumstances. This last really did not require proof,
since no serious evidence to the contrary had ever been presented. What does require wider
appreciation among statistical writers is that it is unwarranted to claim that some presently
known principle of estimation necessarily always yields the best estimate. If in the present
circumstances the minimum normit χ² estimate is easier to compute, and at the same time
is a better estimate than that provided by the maximum likelihood estimator, there may be
other methods of estimate (and I believe there are) which are even easier to compute and,
in the practical circumstances in which they are applied, may be better than the minimum
normit χ² estimator.


To the many young ladies and gentlemen, who, over the last 8 years, have helped in the 
calculations required for the investigations referred to in this paper, I tender my thanks. 
Some, but not all, of the individuals are: Gretchen Eusterman, Gudula Fischer, Marjorie 
Heiges, Bertram Haines, Ann Hogben, David Hogben, Richard Kersten, Beverley New- 
comb, Fred Schwarz, David Sperling. Especially I owe gratitude to Mrs Helen Golenzer, 
who has been with the work from the beginning, and whose talent for orderly filing, and 
clairvoyant ability to locate any desired sheet, no matter how falsely identified by me, made 
progress possible, and prevented Bedlam from supplanting the Mayo Clinic.
SHORTER INTERVALS FOR THE PARAMETER OF THE BINOMIAL
AND POISSON DISTRIBUTIONS 


By W. L. STEVENS* 


SUMMARY. Different methods are examined for determining, for the binomial and Poisson 
parameters, narrower intervals than those furnished by the classical method. A way is found for 
calculating, from existing tables, the limits which are produced by the device of adding, to the 
observation, a randomly chosen number between 0 and 1. 


1. INTRODUCTION 


Some recent papers (Sterne, 1954; Crow, 1956) show that statisticians are concerning them- 
selves with the problem of how to construct intervals for the parameter of the binomial 
distribution, narrower than those found by the usual method (Clopper & Pearson, 1934; 
Fisher & Yates, 1943, Table VIII, I). An interesting solution to this problem, with the 
necessary tables, has appeared in the paper by Crow. It is, however, to be questioned whether
the proposed method (which is based on re-ordering the significance of observations in the 
tails) is satisfactory for all purposes. The intervals are indeed narrower than those found by 
the usual method, and it is true that the probability that the interval will cover the true 
value is not less than that declared. We believe, however, that one often requires more than 
this: one wants to know separately the probabilities or bounds to the probabilities, that the 
value of the parameter is below the lower limit or above the upper limit. This is certainly 
true when we are interested in only one limit—a case which is quite common in practice; for 
example, a purchaser of electric lamps is interested only in the upper limit of the percentage 
defective. It would appear that no statements of the separate probabilities that the value of 
the parameter lies below the lower limit or above the upper can be made in respect of the 
intervals tabulated by Crow. 

There exists, however, another and perhaps more elegant method for constructing
shorter intervals (Anscombe, 1948; Stevens, 1950; Tocher, 1950; Eudey, 1949; Pearson,
1950), which possesses two desirable properties: (1) that statements of probabilities
are equalities instead of inequalities, and (2) that the probabilities of the value of the
parameter lying below the lower limit or above the upper limit are known separately.
Even more, Blank (1956) has shown that the method possesses theoretically optimal
properties.

The method, stated briefly in relation to the binomial, though applicable to any
discontinuous distribution, is as follows: we draw n balls from an infinite population of black
and white balls and find that f of them are black. We then draw a number x from the rectangular
distribution, zero to one. Then y = f + x is a real number between 0 and n + 1, from which we
can find π_0 and π_1, respectively the lower and upper limits to the parameter π, the proportion
of black balls in the population. The probabilities that the value of π lies below the lower
limit, between the two, or above the upper limit, will be known exactly.
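The randomized limits can also be obtained by solving the defining tail equations directly, which gives a check on any table or interpolation. The sketch below is one reading of the method, under the assumption that the lower limit solves Pr(F ≥ f + 1) + (1 − x) Pr(F = f) = α and the upper limit solves Pr(F ≤ f − 1) + x Pr(F = f) = α, where F is binomial (n, π) and α is the probability assigned to each tail. Plain bisection is used for simplicity; the function names are hypothetical.

```python
from math import comb


def binom_pmf(k, n, p):
    """Binomial probability of k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def _bisect(g, target, increasing, iters=200):
    """Solve g(p) = target on [0, 1] for a monotone g by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if (g(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def randomized_limits(n, f, x, alpha):
    """Lower and upper limits for the binomial parameter given y = f + x."""

    def prob_at_least_y(p):  # Pr(F + X >= f + x), increasing in p
        tail = sum(binom_pmf(k, n, p) for k in range(f + 1, n + 1))
        return tail + (1 - x) * binom_pmf(f, n, p)

    def prob_at_most_y(p):  # Pr(F + X <= f + x), decreasing in p
        tail = sum(binom_pmf(k, n, p) for k in range(0, f))
        return tail + x * binom_pmf(f, n, p)

    lower = _bisect(prob_at_least_y, alpha, increasing=True)
    upper = _bisect(prob_at_most_y, alpha, increasing=False)
    return lower, upper


# '3 in 10' with x = 0.27 and 10% in each tail.
print(randomized_limits(10, 3, 0.27, 0.10))
```

When no root exists (for example f = 0 with x < 1 − α), the bisection collapses onto the boundary, so the lower limit comes out as zero without a separate branch.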


* Faculdade de Filosofia, Ciéncias e Letras, Universidade de Sao Paulo, Rua Maria Antonia 294, Sio 
Paulo, Brazil. 
2. CALCULATION OF LIMITS


The functional relation between π, the lower 10% limit, and y, for n = 10, is illustrated in
the graph, based on a table in Stevens (1950). It is seen to be composed of arcs concave
upwards linking the points corresponding to integral values of y. The function is continuous
but the first derivative discontinuous at these points.

Fig. 1. Portion of the graph showing relation between lower 10% limit (n = 10) and y = f + x.


We shall continue the discussion in terms of the lower limit: the necessary change of
words to apply it to the upper limit will be obvious. Now a simple first suggestion for finding
a limit, without using a table, would be to find, by the usual method, the limits
corresponding respectively to the observations f in n and f + 1 in n, and to make a random
linear interpolation between these two values. This is equivalent to replacing the arcs in the
graph by straight lines joining their extremities. But it is immediately seen that the
consequence of doing this is that the probability that π lies below the lower limit will be not
less than that stated; though admittedly it will not be much more. Now, if we have to make an
inequality statement, this must be in the opposite form, i.e. that the probability is not
greater than that stated. Hence, this method cannot be considered satisfactory.

We therefore turn to consider the possibility of interpolating between two consecutive
integral values of y (y = f and y = f + 1, i.e. x = 0 and x = 1) by means of a Taylor's series.
The value of dπ/dx (we will drop the suffix 0) is obtained from Stevens, equation (2.22); we
should note, however, a mistake in equation (2.13): the two h terms there should be interchanged.


We find

\[ \frac{d\pi}{dx} = \frac{1}{\dfrac{(n-f)\,x}{1-\pi} + \dfrac{f\,(1-x)}{\pi}} . \]

The second derivative is found in the usual way from

\[ \frac{d^2\pi}{dx^2} = \frac{\partial}{\partial x}\left(\frac{d\pi}{dx}\right) + \frac{\partial}{\partial \pi}\left(\frac{d\pi}{dx}\right)\frac{d\pi}{dx}, \]

the calculation being simplified by the fact that it is required only at the values x = 0 and
x = 1. The results are

x | dπ/dx           | d²π/dx²
0 | π/f             | (π/f)² [(f + 1)/π − (n − f)/(1 − π)]
1 | (1 − π)/(n − f) | [(1 − π)/(n − f)]² [f/π − (n − f + 1)/(1 − π)]
If x is less than ½, we extrapolate forward from the lower value of π, using the formula

\[ \pi + (d\pi/dx)\,x + \tfrac{1}{2}(d^2\pi/dx^2)\,x^2, \]

with π and its derivatives taken at x = 0, and if greater, we extrapolate backwards from the
higher value, using

\[ \pi - (d\pi/dx)\,(1-x) + \tfrac{1}{2}(d^2\pi/dx^2)\,(1-x)^2, \]

with π and its derivatives taken at x = 1.
To illustrate the process, we will consider the lower 10% limit for the result '3 in 10'. From
Fisher & Yates or other table, we find the lower 10% limits for this result and for the result
'4 in 10'. We find

x | π     | dπ/dx  | ½(d²π/dx²)
0 | 0.116 | 0.0387 | 0.0199
1 | 0.188 | 0.116  | 0.0411
The results of interpolation can now be compared with the tabulated values as follows:

x   | (a)   | Table | (b)
0.0 | 0.116 | 0.116 | -
0.1 | 0.120 | 0.120 | -
0.2 | 0.125 | 0.124 | -
0.3 | 0.129 | 0.129 | 0.127
0.4 | 0.135 | 0.135 | 0.133
0.5 | 0.140 | 0.141 | 0.140
0.6 | 0.146 | 0.148 | 0.148
0.7 | 0.153 | 0.157 | 0.157
0.8 | 0.160 | 0.166 | 0.166
0.9 | -     | 0.176 | 0.177
1.0 | -     | 0.188 | 0.188

(a) Extrapolated forward from x = 0. (b) Extrapolated backward from x = 1.

The errors of interpolation would seem to be too small to be revealed by this comparison.













To find the upper 10% limit, we find from the table the limits for the observed result 
‘3 in 10’ and for the result ‘2 in 10’. These correspond respectively to x = 1 and x = 0. 
Hence 











x | π     | dπ/dx | ½(d²π/dx²)
0 | 0.450 | 0.150 | -0.0432
1 | 0.552 | 0.064 | -0.0255

















For x = 0.5 both forward and backward extrapolation yield π = 0.514, while the table gives
π = 0.512. The difference probably represents the error in the table.

It is advisable to use the same value of x for both limits and indeed for all intervals, if 
more than one is determined. Thus suppose the table of random numbers yields 268..., 
which on rounding gives x = 0-27. The limits will be 


π_0 = 0.116 + (0.039)(0.27) + (0.020)(0.27)² = 0.128,
π_1 = 0.450 + (0.150)(0.27) + (−0.043)(0.27)² = 0.487,
and we can state that
Pr(π < 0.128) = 10%,
Pr(0.128 < π < 0.487) = 80%,
Pr(0.487 < π) = 10%.
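The interpolation rule lends itself to a few lines of code. The sketch below is a minimal version, assuming the end-point derivatives dπ/dx = π/f at x = 0 and (1 − π)/(n − f) at x = 1, with the second derivatives written out in the comments; these closed forms were re-derived for this sketch and reproduce the tabulated 0.0387, 0.0199, 0.116 and 0.0411. The function name `taylor_limit` is hypothetical, and the cases f = 0 and f = n are not handled here because they need the special treatment described in the text.

```python
def taylor_limit(pi0, pi1, n, f, x):
    """Second-order Taylor interpolation of a randomized binomial limit.

    pi0, pi1 : tabulated limits at x = 0 and x = 1
               (lower limit: results 'f in n' and 'f + 1 in n';
                upper limit: results 'f - 1 in n' and 'f in n')
    n, f     : sample size and observed count (0 < f < n)
    x        : random number in [0, 1]
    """
    if x < 0.5:
        # Forward from x = 0:
        # d2 = (pi/f)^2 [(f + 1)/pi - (n - f)/(1 - pi)]
        d1 = pi0 / f
        d2 = d1**2 * ((f + 1) / pi0 - (n - f) / (1 - pi0))
        return pi0 + d1 * x + 0.5 * d2 * x**2
    # Backward from x = 1:
    # d2 = ((1 - pi)/(n - f))^2 [f/pi - (n - f + 1)/(1 - pi)]
    d1 = (1 - pi1) / (n - f)
    d2 = d1**2 * (f / pi1 - (n - f + 1) / (1 - pi1))
    return pi1 - d1 * (1 - x) + 0.5 * d2 * (1 - x) ** 2


# The worked example: '3 in 10', x = 0.27, 10% limits.
lower = taylor_limit(0.116, 0.188, 10, 3, 0.27)
upper = taylor_limit(0.450, 0.552, 10, 3, 0.27)
print(round(lower, 3), round(upper, 3))
```

Using the same x for both limits, as the text advises, only requires passing the same value to both calls.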


Special treatment is needed for f = 0 and f = n. It is recalled that if f = 0 and x < 1 − P,
then π = 0. Hence, if such a value of x is drawn (e.g. x < 0.9, for a lower 10% limit) π is
taken as zero. Otherwise, we extrapolate backwards from the limit corresponding to f = 1.

Finally, we may note that essentially the same method may be used for setting limits to μ,
the parameter of the Poisson distribution. The derivatives at x = 0 and 1 are:

x | dμ/dx | d²μ/dx²
0 | μ/f   | (μ/f)² [(f + 1)/μ − 1]
1 | 1     | f/μ − 1








3. CONCLUSION


It is open to question whether anything material is gained by devices for finding intervals 
shorter than those provided by the classical method. If, however, the statistician is deter- 
mined to have them, it would seem that, for both practical and theoretical reasons, the 
device of adding to the frequency a random number between zero and one is superior to any 
other. This paper has shown that these intervals can be found by a simple interpolation in 
existing tables. 
UPPER PERCENTAGE POINTS OF THE GENERALIZED BETA
DISTRIBUTION. II 


By F. G. FOSTER 
Research Techniques Unit, London School of Economics


1. INTRODUCTION 


These tables extend the tabulation of the 80, 85, 90, 95 and 99% points of I_x(k; p, q) to
the case k = 3. They are a continuation of the tables for k = 2 given in Foster & Rees (1957),
which will be referred to as 'I'. Reference is made to I for all definitions.


2. INTERPOLATION 


Interpolation will be required for integer values of ν_1 = 2q + 2. For ν_1 greater than about 80,
linear interpolation will give accuracy to one unit in the fourth decimal place; for smaller
values of ν_1, 4-point interpolation to halves should be used. For interpolation between
ν_1 = 194 (q = 96) and ν_1 = ∞, 3-point harmonic interpolation, based on q = 48, q = 96
and q = ∞, may be used, as indicated in I.


3. USES OF THE TABLES 


Typical applications to tests of significance for the case k = 2 were given in I. A further 
example is given below to illustrate the present tables. 

Example. Table 1 gives the analysis of dispersion for the three characters, head length
(x_1), height (x_2) and weight (x_3), measured on 140 schoolboys of almost the same age,
belonging to six different schools in an Indian city. This example is taken from Rao (1952,
p. 263) who examines the data using the alternative Λ criterion of Pearson & Wilks.


Table 1

Sums of products matrix

                | D.F. |      | x_1²     | x_2²    | x_3²     | x_1x_2  | x_1x_3  | x_2x_3
Between schools | 5    | Q_ij | 752.0    | 151.3   | 1,612.7  | 214.2   | 521.3   | 401.2
Within schools  | 134  | W_ij | 12,809.3 | 1,499.6 | 21,009.6 | 1,003.7 | 2,671.2 | 4,123.6
Total           | 139  | S_ij | 13,561.3 | 1,650.9 | 22,622.3 | 1,217.9 | 3,192.5 | 4,524.8








It is required to test whether there are any significant differences in boys' physique
between schools. On the null hypothesis of no differences, the matrices (Q_ij) and (W_ij),
divided by their degrees of freedom, are independent estimates of the same parent
dispersion matrix, and our test consists in computing the greatest latent root of |Q − θS| = 0,
and entering the table with ν_1 = 134 and ν_2 = 5. We find that θ_max = 0.10055, which is
only significant at the 15% level.
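The greatest latent root quoted above can be checked numerically from Table 1. The sketch below uses a device that is not the author's: since S is positive definite, θS − Q is positive definite exactly when θ exceeds the greatest root of |Q − θS| = 0, so the root can be located by bisection on that condition. The assignment of the six tabulated columns to squares and cross-products (x_1², x_2², x_3², x_1x_2, x_1x_3, x_2x_3) is an assumption read from the garbled header, and all function names are hypothetical.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def is_pos_def(m):
    """Sylvester's criterion for a symmetric 3 x 3 matrix."""
    minor2 = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return m[0][0] > 0 and minor2 > 0 and det3(m) > 0


def sym(d1, d2, d3, o12, o13, o23):
    """Build a symmetric matrix from its diagonal and off-diagonal entries."""
    return [[d1, o12, o13], [o12, d2, o23], [o13, o23, d3]]


# Table 1: between-schools (Q) and total (S) sums of products.
Q = sym(752.0, 151.3, 1612.7, 214.2, 521.3, 401.2)
S = sym(13561.3, 1650.9, 22622.3, 1217.9, 3192.5, 4524.8)


def greatest_root(Q, S, iters=60):
    """Greatest root of |Q - theta S| = 0, assuming S - Q is positive definite."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        M = [[mid * S[r][c] - Q[r][c] for c in range(3)] for r in range(3)]
        if is_pos_def(M):
            hi = mid  # mid is above the greatest root
        else:
            lo = mid  # mid is at or below the greatest root
    return hi


theta_max = greatest_root(Q, S)
print(round(theta_max, 4))  # close to the 0.10055 quoted in the text
```

The agreement with the quoted value supports the column assignment assumed here, although it does not prove it.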


4. METHOD OF COMPUTATION
The computations were carried out on the DEUCE Computer of the English Electric
Company. Let θ_max denote the greatest root of
where A and B are independent estimates, based on ν_1 and ν_2 degrees of freedom, of a parent
dispersion matrix of a trivariate Normal population. Define

\[ I_x(3; p, q) = \Pr\{\theta_{\max} \le x\}, \]
where p = ½(ν_2 − 2), q = ½(|ν_1 − 3| + 1). Roy (1945) gives a formula which, in the present
notation, may be written
\[ I_x(3; p, q) = K'\{2B_x(p, q)\,B_x(2p+2, 2q) - 2B_x(p+1, q)\,B_x(2p+1, 2q) - x^{p+1}(1-x)^{q}\,B_x(2; p, q)\}, \]
where B_x(p, q) is the Incomplete Beta function;
\[ K' = \frac{1}{2}\,\frac{p+q}{p+q+1}\,\frac{1}{B(p, q)}\,\frac{\Gamma(2p+2q+3)}{\Gamma(2p+1)\,\Gamma(2q+1)}, \]
and B_x(2; p, q) = I_x(2; p, q)/K, as defined in I. As it stands, this formula is too ill-conditioned
to be useful for computation. If, however, we substitute for B_x(2; p, q) and distribute the
normalizing constant, K', we obtain


\[ I_x(3; p, q) = I_x(2p+2, 2q)\,I_x(p, q) + \frac{p}{q}(2p+2q+1)\{I_x(2p+2, 2q)\,I_x(p, q) - I_x(2p+1, 2q)\,I_x(p+1, q) + b_x(2p, 2q)\,I_x(p, q) - b_x(p, q)\,I_x(2p, 2q)\}, \]
where I_x(p, q) is the Beta distribution, and

\[ b_x(p, q) = x^{p+1}(1-x)^{q}\,\frac{\Gamma(p+q+1)}{p\,\Gamma(p+1)\,\Gamma(q)}. \]

The ranges of p and q were p = ½(½)4, q = 1(1)96. For the integral values of p, I_x(p, q)
was computed recursively as in I, and b_x(p, q) by means of the relations

\[ b_x(p, 0) = 0 \quad (p > 0), \]
\[ b_x(1, 1) = 2x^2(1-x), \]
\[ b_x(1, q+1) = b_x(1, q)\,(1-x)\,(q+2)/q \quad (q > 0), \]
\[ b_x(p, q) = \frac{(p+q)(p-1)}{(p+q)(p-1)+1}\{(1-x)\,b_x(p, q-1) + x\,b_x(p-1, q)\} \quad (p, q > 0). \]


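For the half-integral values of p, I_x(p, q) was computed recursively by means of

\[ I_x(\tfrac12, 1) = x^{1/2}, \]
\[ I_x(p+1, 1) = x\,I_x(p, 1) \quad (p > 0), \]
\[ I_x(\tfrac12, q+1) = \frac{(2q-1)(1-x)}{2q}\{I_x(\tfrac12, q) - I_x(\tfrac12, q-1)\} + I_x(\tfrac12, q) \quad (q > 0), \]
\[ I_x(p, q) = (1-x)\,I_x(p, q-1) + x\,I_x(p-1, q) \quad (p, q > 0), \]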
and b_x(p, q) was computed by means of

\[ b_x(\tfrac12, 1) = 3x^{3/2}(1-x), \]
\[ b_x(\tfrac12, q+1) = b_x(\tfrac12, q)\,(1-x)\,(2q+3)/2q, \]
\[ b_x(p, 0) = 0, \]

together with the last formula for b_x(p, q) given above.

As projected in I, the percentage points were obtained directly without prior tabulation 
of the distribution function. The method used (which may be described in more detail 
elsewhere) was to compute a table of I_x(3; p, q) for the whole range of p and q for a sequence
of values of x, starting at 0 and finishing at 1 at a fixed interval, h. The tables for the current
value of x and the three preceding values, x − h, x − 2h, x − 3h, were stored on the drum.
On completion of each current table, a 4-point inverse interpolation was carried out for
each of the five required percentage points on the four values in position (p, q) on the four
tables for all positions (p, q) such that I_{x−2h}(3; p, q) was less than, and I_{x−h}(3; p, q) was
greater than the required percentage value. The resulting value was punched out on a card
together with the corresponding p and q and an indication of the percentage value. The 
values were in fact punched out twice on each card, once rounded to four decimal places and 
once to six decimal places. Differencing was carried out on the latter values as a check on 
accuracy. The cards were subsequently sorted and the rounded values tabulated in the 
required layout. An interval h = 2-* was found to give accuracy to at least four decimal 
places for most of the percentage points. Some of the 99% points near the upper limit, 
x = 1, had to be obtained using a smaller interval of 2-". 
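For readers who wish to reproduce individual entries, the following sketch evaluates the well-conditioned form of I_x(3; p, q), taken here as I_x(2p+2, 2q) I_x(p, q) + (p/q)(2p+2q+1){I_x(2p+2, 2q) I_x(p, q) − I_x(2p+1, 2q) I_x(p+1, q) + b_x(2p, 2q) I_x(p, q) − b_x(p, q) I_x(2p, 2q)} with b_x(p, q) = x^(p+1) (1 − x)^q Γ(p+q+1)/{p Γ(p+1) Γ(q)}. The printed formula is badly garbled in this copy, so that reading is an assumption of the sketch. It is restricted to integral p and q, for which the Beta distribution is a binomial tail sum, and it finds percentage points by simple bisection rather than by the tabulate-and-inverse-interpolate scheme used on DEUCE. All function names are hypothetical.

```python
from math import comb, exp, lgamma


def inc_beta(x, a, b):
    """Regularized incomplete beta I_x(a, b) for positive integers a, b."""
    n = a + b - 1
    return sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(a, n + 1))


def b_x(x, p, q):
    """The auxiliary function b_x(p, q), via log-gamma to avoid overflow."""
    log_c = lgamma(p + q + 1) - lgamma(p + 1) - lgamma(q)
    return x ** (p + 1) * (1 - x) ** q * exp(log_c) / p


def i3(x, p, q):
    """I_x(3; p, q) for integral p, q >= 1."""
    c = (p / q) * (2 * p + 2 * q + 1)
    lead = inc_beta(x, 2 * p + 2, 2 * q) * inc_beta(x, p, q)
    return lead + c * (
        lead
        - inc_beta(x, 2 * p + 1, 2 * q) * inc_beta(x, p + 1, q)
        + b_x(x, 2 * p, 2 * q) * inc_beta(x, p, q)
        - b_x(x, p, q) * inc_beta(x, 2 * p, 2 * q)
    )


def percentage_point(P, p, q, iters=60):
    """Value of x with I_x(3; p, q) = P, by bisection on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if i3(mid, p, q) < P:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# p = q = 1 corresponds to nu_1 = nu_2 = 4.
print(round(percentage_point(0.80, 1, 1), 4))
```

For p = q = 1 the expression collapses algebraically to x⁶, and 0.8^(1/6) ≈ 0.9635 agrees with the tabulated 80% point at ν_1 = 4, ν_2 = 4, which gives some confidence in the reading of the formula adopted here.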

The same method of computation could be used to obtain values for ν_2 > 10, should the
need for these be felt. At this stage, however, it was thought more useful to devote the 
computational effort to obtaining a fuller tabulation for 7. 


The author is indebted to the Director of Research of the English Electric Company for 
permission to use the DEUCE Computer of the London Computing Service for this work; 
and to the staff of the London Computing Service for much helpful advice and assistance. 
ν_1 | P    | ν_2 = 3 | 4      | 5      | 6      | 7      | 8      | 9
4   | 0.80 | 0.9516 | 0.9635 | 0.9707 | 0.9755 | 0.9790 | 0.9816 | 0.9836
    | 0.85 | 0.9645 | 0.9733 | 0.9786 | 0.9821 | 0.9846 | 0.9865 | 0.9880
    | 0.90 | 0.9769 | 0.9826 | 0.9861 | 0.9884 | 0.9900 | 0.9913 | 0.9922
    | 0.95 | 0.9887 | 0.9915 | 0.9932 | 0.9943 | 0.9951 | 0.9957 | 0.9962
    | 0.99 | 0.9978 | 0.9983 | 0.9987 | 0.9989 | 0.9990 | 0.9992 | 0.9993
6   | 0.80 | 0.8294 | 0.8629 | 0.8852 | 0.9012 | 0.9132 | 0.9226 | 0.9302
    | 0.85 | 0.8561 | 0.8846 | 0.9036 | 0.9171 | 0.9273 | 0.9352 | 0.9416
    | 0.90 | 0.8857 | 0.9087 | 0.9238 | 0.9346 | 0.9427 | 0.9490 | 0.9540
    | 0.95 | 0.9218 | 0.9378 | 0.9482 | 0.9556 | 0.9612 | 0.9655 | 0.9689
    | 0.99 | 0.9664 | 0.9734 | 0.9779 | 0.9811 | 0.9835 | 0.9853 | 0.9868
8   | 0.80 | 0.7180 | 0.7639 | 0.7965 | 0.8210 | 0.8400 | 0.8554 | 0.8680
    | 0.85 | 0.7497 | 0.7911 | 0.8203 | 0.8421 | 0.8591 | 0.8728 | 0.8840
    | 0.90 | 0.7871 | 0.8229 | 0.8480 | 0.8667 | 0.8812 | 0.8929 | 0.9024
    | 0.95 | 0.8365 | 0.8646 | 0.8842 | 0.8986 | 0.9098 | 0.9188 | 0.9261
    | 0.99 | 0.9086 | 0.9247 | 0.9359 | 0.9441 | 0.9504 | 0.9554 | 0.9595
10  | 0.80 | 0.6281 | 0.6798 | 0.7181 | 0.7478 | 0.7717 | 0.7913 | 0.8077
    | 0.85 | 0.6610 | 0.7090 | 0.7443 | 0.7716 | 0.7935 | 0.8114 | 0.8264
    | 0.90 | 0.7008 | 0.7440 | 0.7757 | 0.8000 | 0.8195 | 0.8353 | 0.8486
    | 0.95 | 0.7560 | 0.7922 | 0.8185 | 0.8386 | 0.8546 | 0.8676 | 0.8784
    | 0.99 | 0.8439 | 0.8679 | 0.8852 | 0.8983 | 0.9086 | 0.9170 | 0.9239
12  | 0.80 | 0.5564 | 0.6102 | 0.6513 | 0.6840 | 0.7108 | 0.7332 | 0.7523
    | 0.85 | 0.5887 | 0.6396 | 0.6783 | 0.7089 | 0.7340 | 0.7549 | 0.7726
    | 0.90 | 0.6287 | 0.6757 | 0.7112 | 0.7392 | 0.7620 | 0.7810 | 0.7971
    | 0.95 | 0.6857 | 0.7266 | 0.7574 | 0.7815 | 0.8010 | 0.8172 | 0.8309
    | 0.99 | 0.7816 | 0.8113 | 0.8334 | 0.8505 | 0.8643 | 0.8757 | 0.8853
14  | 0.80 | 0.4986 | 0.5525 | 0.5946 | 0.6289 | 0.6574 | 0.6816 | 0.7025
    | 0.85 | 0.5297 | 0.5813 | 0.6215 | 0.6541 | 0.6811 | 0.7039 | 0.7236
    | 0.90 | 0.5687 | 0.6172 | 0.6548 | 0.6851 | 0.7101 | 0.7313 | 0.7494
    | 0.95 | 0.6254 | 0.6689 | 0.7024 | 0.7292 | 0.7512 | 0.7698 | 0.7857
    | 0.99 | 0.7245 | 0.7582 | 0.7837 | 0.8040 | 0.8206 | 0.8344 | 0.8462
16  | 0.80 | 0.4512 | 0.5042 | 0.5464 | 0.5812 | 0.6106 | 0.6359 | 0.6579
    | 0.85 | 0.4809 | 0.5321 | 0.5728 | 0.6062 | 0.6343 | 0.6584 | 0.6794
    | 0.90 | 0.5185 | 0.5672 | 0.6057 | 0.6372 | 0.6637 | 0.6863 | 0.7058
    | 0.95 | 0.5739 | 0.6185 | 0.6535 | 0.6820 | 0.7058 | 0.7261 | 0.7436
    | 0.99 | 0.6735 | 0.7096 | 0.7376 | 0.7601 | 0.7788 | 0.7947 | 0.8083
18  | 0.80 | 0.4118 | 0.4633 | 0.5050 | 0.5398 | 0.5696 | 0.5954 | 0.6181
    | 0.85 | 0.4401 | 0.4902 | 0.5307 | 0.5643 | 0.5930 | 0.6179 | 0.6396
    | 0.90 | 0.4760 | 0.5242 | 0.5629 | 0.5950 | 0.6222 | 0.6457 | 0.6663
    | 0.95 | 0.5296 | 0.5745 | 0.6103 | 0.6398 | 0.6647 | 0.6861 | 0.7048
    | 0.99 | 0.6282 | 0.6657 | 0.6953 | 0.7195 | 0.7398 | 0.7571 | 0.7721
20  | 0.80 | 0.3787 | 0.4284 | 0.4692 | 0.5037 | 0.5334 | 0.5595 | 0.5825
    | 0.85 | 0.4054 | 0.4542 | 0.4940 | 0.5276 | 0.5564 | 0.5816 | 0.6038
    | 0.90 | 0.4397 | 0.4870 | 0.5254 | 0.5576 | 0.5852 | 0.6093 | 0.6304
    | 0.95 | 0.4914 | 0.5359 | 0.5719 | 0.6019 | 0.6275 | 0.6497 | 0.6692
    | 0.99 | 0.5879 | 0.6262 | 0.6568 | 0.6821 | 0.7035 | 0.7220 | 0.7381
22  | 0.80 | 0.3504 | 0.3983 | 0.4380 | 0.4719 | 0.5014 | 0.5274 | 0.5505
    | 0.85 | 0.3758 | 0.4230 | 0.4619 | 0.4951 | 0.5238 | 0.5491 | 0.5715
    | 0.90 | 0.4085 | 0.4545 | 0.4924 | 0.5244 | 0.5521 | 0.5764 | 0.5979
    | 0.95 | 0.4580 | 0.5019 | 0.5377 | 0.5679 | 0.5938 | 0.6165 | 0.6365
    | 0.99 | 0.5521 | 0.5906 | 0.6218 | 0.6478 | 0.6700 | 0.6893 | 0.7063

This table gives the values of x for which Pr(θ_max ≤ x) = I_x(3; p, q) = P, where p = ½(ν_2 − 2), q = ½(ν_1 − 2).





Generalized Beta distribution (cont.)

ν_1 | P    | ν_2 = 3 | 4      | 5      | 6      | 7      | 8      | 9      | 10
24  | 0.80 | 0.3259 | 0.3720 | 0.4106 | 0.4438 | 0.4728 | 0.4986 | 0.5217 | 0.5426
    | 0.85 | 0.3501 | 0.3957 | 0.4337 | 0.4662 | 0.4947 | 0.5198 | 0.5423 | 0.5626
    | 0.90 | 0.3813 | 0.4260 | 0.4631 | 0.4947 | 0.5223 | 0.5466 | 0.5683 | 0.5878
    | 0.95 | 0.4288 | 0.4718 | 0.5072 | 0.5373 | 0.5633 | 0.5862 | 0.6066 | 0.6249
    | 0.99 | 0.5201 | 0.5586 | 0.5900 | 0.6164 | 0.6392 | 0.6590 | 0.6766 | 0.6923
26  | 0.80 | 0.3046 | 0.3490 | 0.3864 | 0.4187 | 0.4473 | 0.4727 | 0.4957 | 0.5165
    | 0.85 | 0.3276 | 0.3716 | 0.4086 | 0.4404 | 0.4685 | 0.4934 | 0.5158 | 0.5361
    | 0.90 | 0.3574 | 0.4007 | 0.4370 | 0.4681 | 0.4954 | 0.5196 | 0.5413 | 0.5610
    | 0.95 | 0.4030 | 0.4450 | 0.4798 | 0.5096 | 0.5356 | 0.5586 | 0.5792 | 0.5977
    | 0.99 | 0.4914 | 0.5296 | 0.5610 | 0.5876 | 0.6107 | 0.6309 | 0.6490 | 0.6651
28  | 0.80 | 0.2859 | 0.3286 | 0.3648 | 0.3963 | 0.4243 | 0.4493 | 0.4720 | 0.4926
    | 0.85 | 0.3078 | 0.3503 | 0.3861 | 0.4173 | 0.4449 | 0.4694 | 0.4917 | 0.5119
    | 0.90 | 0.3363 | 0.3783 | 0.4136 | 0.4441 | 0.4710 | 0.4951 | 0.5167 | 0.5363
    | 0.95 | 0.3801 | 0.4210 | 0.4552 | 0.4846 | 0.5104 | 0.5333 | 0.5539 | 0.5726
    | 0.99 | 0.4656 | 0.5033 | 0.5345 | 0.5612 | 0.5844 | 0.6049 | 0.6232 | 0.6398
30  | 0.80 | 0.2694 | 0.3104 | 0.3455 | 0.3761 | 0.4035 | 0.4280 | 0.4504 | 0.4708
    | 0.85 | 0.2903 | 0.3312 | 0.3660 | 0.3964 | 0.4234 | 0.4476 | 0.4696 | 0.4897
    | 0.90 | 0.3175 | 0.3581 | 0.3925 | 0.4224 | 0.4489 | 0.4726 | 0.4941 | 0.5136
    | 0.95 | 0.3595 | 0.3993 | 0.4328 | 0.4618 | 0.4873 | 0.5101 | 0.5307 | 0.5493
    | 0.99 | 0.4423 | 0.4794 | 0.5103 | 0.5369 | 0.5601 | 0.5808 | 0.5993 | 0.6160
32  | 0.80 | 0.2546 | 0.2941 | 0.3280 | 0.3579 | 0.3845 | 0.4087 | 0.4307 | 0.4508
    | 0.85 | 0.2746 | 0.3141 | 0.3479 | 0.3775 | 0.4039 | 0.4277 | 0.4494 | 0.4693
    | 0.90 | 0.3007 | 0.3400 | 0.3734 | 0.4027 | 0.4287 | 0.4521 | 0.4733 | 0.4927
    | 0.95 | 0.3411 | 0.3798 | 0.4125 | 0.4410 | 0.4661 | 0.4887 | 0.5092 | 0.5278
    | 0.99 | 0.4211 | 0.4575 | 0.4881 | 0.5144 | 0.5376 | 0.5583 | 0.5769 | 0.5938
34  | 0.80 | 0.2414 | 0.2795 | 0.3123 | 0.3413 | 0.3673 | 0.3909 | 0.4125 | 0.4324
    | 0.85 | 0.2605 | 0.2986 | 0.3314 | 0.3602 | 0.3861 | 0.4095 | 0.4308 | 0.4504
    | 0.90 | 0.2855 | 0.3236 | 0.3561 | 0.3847 | 0.4102 | 0.4332 | 0.4542 | 0.4734
    | 0.95 | 0.3244 | 0.3620 | 0.3940 | 0.4219 | 0.4467 | 0.4690 | 0.4893 | 0.5079
    | 0.99 | 0.4018 | 0.4375 | 0.4676 | 0.4937 | 0.5168 | 0.5374 | 0.5561 | 0.5730
36  | 0.80 | 0.2295 | 0.2662 | 0.2980 | 0.3261 | 0.3515 | 0.3746 | 0.3958 | 0.4154
    | 0.85 | 0.2478 | 0.2846 | 0.3164 | 0.3445 | 0.3697 | 0.3927 | 0.4137 | 0.4330
    | 0.90 | 0.2718 | 0.3087 | 0.3403 | 0.3682 | 0.3932 | 0.4158 | 0.4364 | 0.4555
    | 0.95 | 0.3093 | 0.3458 | 0.3770 | 0.4043 | 0.4287 | 0.4508 | 0.4709 | 0.4893
    | 0.99 | 0.3842 | 0.4191 | 0.4488 | 0.4746 | 0.4974 | 0.5180 | 0.5366 | 0.5536
38  | 0.80 | 0.2187 | 0.2541 | 0.2849 | 0.3123 | 0.3370 | 0.3596 | 0.3804 | 0.3997
    | 0.85 | 0.2363 | 0.2719 | 0.3027 | 0.3300 | 0.3547 | 0.3772 | 0.3978 | 0.4169
    | 0.90 | 0.2594 | 0.2951 | 0.3258 | 0.3530 | 0.3775 | 0.3997 | 0.4200 | 0.4388
    | 0.95 | 0.2955 | 0.3309 | 0.3614 | 0.3882 | 0.4122 | 0.4339 | 0.4537 | 0.4720
    | 0.99 | 0.3680 | 0.4022 | 0.4313 | 0.4568 | 0.4794 | 0.4998 | 0.5183 | 0.5353
40  | 0.80 | 0.2088 | 0.2430 | 0.2729 | 0.2995 | 0.3237 | 0.3458 | 0.3662 | 0.3851
    | 0.85 | 0.2258 | 0.2602 | 0.2901 | 0.3167 | 0.3408 | 0.3628 | 0.3831 | 0.4018
    | 0.90 | 0.2480 | 0.2826 | 0.3125 | 0.3390 | 0.3630 | 0.3848 | 0.4048 | 0.4233
    | 0.95 | 0.2828 | 0.3173 | 0.3470 | 0.3732 | 0.3968 | 0.4182 | 0.4377 | 0.4558
    | 0.99 | 0.3531 | 0.3865 | 0.4151 | 0.4402 | 0.4626 | 0.4828 | 0.5012 | 0.5181
42  | 0.80 | 0.1998 | 0.2329 | 0.2619 | 0.2878 | 0.3113 | 0.3329 | 0.3529 | 0.3715
    | 0.85 | 0.2161 | 0.2495 | 0.2785 | 0.3045 | 0.3280 | 0.3495 | 0.3694 | 0.3878
    | 0.90 | 0.2376 | 0.2711 | 0.3002 | 0.3261 | 0.3495 | 0.3709 | 0.3906 | 0.4088
    | 0.95 | 0.2712 | 0.3048 | 0.3337 | 0.3594 | 0.3825 | 0.4035 | 0.4228 | 0.4407
    | 0.99 | 0.3393 | 0.3720 | 0.4001 | 0.4248 | 0.4469 | 0.4669 | 0.4851 | 0.5020





























ν_1 | P    | ν_2 = 3 | 4      | 5      | 6      | 7      | 8      | 9      | 10
44  | 0.80 | 0.1916 | 0.2236 | 0.2517 | 0.2769 | 0.2999 | 0.3210 | 0.3406 | 0.3588
    | 0.85 | 0.2073 | 0.2396 | 0.2678 | 0.2931 | 0.3161 | 0.3371 | 0.3566 | 0.3748
    | 0.90 | 0.2280 | 0.2605 | 0.2888 | 0.3141 | 0.3370 | 0.3580 | 0.3773 | 0.3953
    | 0.95 | 0.2605 | 0.2931 | 0.3241 | 0.3465 | 0.3692 | 0.3898 | 0.4089 | 0.4265
    | 0.99 | 0.3266 | 0.3586 | 0.3861 | 0.4104 | 0.4321 | 0.4519 | 0.4700 | 0.4867
46  | 0.80 | 0.1840 | 0.2150 | 0.2423 | 0.2668 | 0.2892 | 0.3099 | 0.3290 | 0.3469
    | 0.85 | 0.1991 | 0.2304 | 0.2579 | 0.2825 | 0.3050 | 0.3256 | 0.3447 | 0.3625
    | 0.90 | 0.2192 | 0.2507 | 0.2783 | 0.3030 | 0.3254 | 0.3459 | 0.3649 | 0.3826
    | 0.95 | 0.2506 | 0.2824 | 0.3099 | 0.3345 | 0.3567 | 0.3770 | 0.3958 | 0.4132
    | 0.99 | 0.3147 | 0.3460 | 0.3730 | 0.3969 | 0.4183 | 0.4379 | 0.4558 | 0.4724
48  | 0.80 | 0.1770 | 0.2070 | 0.2335 | 0.2574 | 0.2793 | 0.2995 | 0.3183 | 0.3358
    | 0.85 | 0.1916 | 0.2220 | 0.2487 | 0.2727 | 0.2946 | 0.3148 | 0.3336 | 0.3511
    | 0.90 | 0.2110 | 0.2416 | 0.2685 | 0.2926 | 0.3145 | 0.3346 | 0.3533 | 0.3707
    | 0.95 | 0.2415 | 0.2723 | 0.2993 | 0.3233 | 0.3451 | 0.3650 | 0.3835 | 0.4006
    | 0.99 | 0.3037 | 0.3343 | 0.3608 | 0.3842 | 0.4054 | 0.4247 | 0.4424 | 0.4588
50  | 0.80 | 0.1704 | 0.1996 | 0.2254 | 0.2487 | 0.2700 | 0.2898 | 0.3082 | 0.3254
    | 0.85 | 0.1846 | 0.2141 | 0.2401 | 0.2635 | 0.2849 | 0.3047 | 0.3231 | 0.3403
    | 0.90 | 0.2034 | 0.2332 | 0.2594 | 0.2829 | 0.3043 | 0.3241 | 0.3424 | 0.3595
    | 0.95 | 0.2329 | 0.2630 | 0.2893 | 0.3128 | 0.3342 | 0.3538 | 0.3719 | 0.3888
    | 0.99 | 0.2934 | 0.3233 | 0.3493 | 0.3723 | 0.3932 | 0.4122 | 0.4297 | 0.4460
52  | 0.80 | 0.1644 | 0.1927 | 0.2178 | 0.2405 | 0.2613 | 0.2807 | 0.2987 | 0.3156
    | 0.85 | 0.1781 | 0.2068 | 0.2321 | 0.2549 | 0.2759 | 0.2952 | 0.3133 | 0.3302
    | 0.90 | 0.1963 | 0.2253 | 0.2508 | 0.2738 | 0.2948 | 0.3141 | 0.3321 | 0.3490
    | 0.95 | 0.2250 | 0.2543 | 0.2799 | 0.3030 | 0.3239 | 0.3432 | 0.3610 | 0.3777
    | 0.99 | 0.2838 | 0.3131 | 0.3385 | 0.3612 | 0.3817 | 0.4004 | 0.4177 | 0.4338
54  | 0.80 | 0.1588 | 0.1863 | 0.2107 | 0.2328 | 0.2532 | 0.2721 | 0.2898 | 0.3064
    | 0.85 | 0.1721 | 0.1999 | 0.2246 | 0.2469 | 0.2674 | 0.2863 | 0.3040 | 0.3206
    | 0.90 | 0.1897 | 0.2179 | 0.2428 | 0.2653 | 0.2858 | 0.3048 | 0.3225 | 0.3390
    | 0.95 | 0.2175 | 0.2461 | 0.2712 | 0.2937 | 0.3142 | 0.3332 | 0.3507 | 0.3671
    | 0.99 | 0.2748 | 0.3034 | 0.3284 | 0.3506 | 0.3708 | 0.3893 | 0.4064 | 0.4223
56  | 0.80 | 0.1535 | 0.1802 | 0.2040 | 0.2256 | 0.2456 | 0.2641 | 0.2814 | 0.2976
    | 0.85 | 0.1664 | 0.1935 | 0.2176 | 0.2394 | 0.2594 | 0.2779 | 0.2953 | 0.3116
    | 0.90 | 0.1835 | 0.2110 | 0.2353 | 0.2572 | 0.2774 | 0.2960 | 0.3133 | 0.3296
    | 0.95 | 0.2106 | 0.2385 | 0.2630 | 0.2850 | 0.3051 | 0.3237 | 0.3410 | 0.3572
    | 0.99 | 0.2664 | 0.2944 | 0.3188 | 0.3407 | 0.3605 | 0.3787 | 0.3956 | 0.4113
58  | 0.80 | 0.1486 | 0.1746 | 0.1978 | 0.2189 | 0.2383 | 0.2565 | 0.2734 | 0.2894
    | 0.85 | 0.1611 | 0.1875 | 0.2110 | 0.2322 | 0.2518 | 0.2700 | 0.2870 | 0.3030
    | 0.90 | 0.1777 | 0.2045 | 0.2282 | 0.2497 | 0.2694 | 0.2876 | 0.3047 | 0.3207
    | 0.95 | 0.2041 | 0.2313 | 0.2552 | 0.2768 | 0.2966 | 0.3148 | 0.3318 | 0.3477
    | 0.99 | 0.2584 | 0.2858 | 0.3098 | 0.3312 | 0.3508 | 0.3687 | 0.3854 | 0.4009
60  | 0.80 | 0.1440 | 0.1693 | 0.1919 | 0.2125 | 0.2316 | 0.2493 | 0.2659 | 0.2816
    | 0.85 | 0.1561 | 0.1819 | 0.2047 | 0.2255 | 0.2447 | 0.2625 | 0.2792 | 0.2950
    | 0.90 | 0.1723 | 0.1984 | 0.2216 | 0.2426 | 0.2619 | 0.2798 | 0.2965 | 0.3122
    | 0.95 | 0.1979 | 0.2245 | 0.2479 | 0.2691 | 0.2884 | 0.3064 | 0.3231 | 0.3387
    | 0.99 | 0.2509 | 0.2777 | 0.3012 | 0.3223 | 0.3415 | 0.3592 | 0.3757 | 0.3910
62  | 0.80 | 0.1396 | 0.1643 | 0.1864 | 0.2065 | 0.2251 | 0.2425 | 0.2588 | 0.2742
    | 0.85 | 0.1515 | 0.1765 | 0.1989 | 0.2192 | 0.2380 | 0.2555 | 0.2718 | 0.2873
    | 0.90 | 0.1672 | 0.1927 | 0.2153 | 0.2358 | 0.2547 | 0.2723 | 0.2887 | 0.3042
    | 0.95 | 0.1921 | 0.2181 | 0.2410 | 0.2617 | 0.2807 | 0.2983 | 0.3148 | 0.3302
    | 0.99 | 0.2438 | 0.2701 | 0.2932 | 0.3139 | 0.3328 | 0.3502 | 0.3664 | 0.3816
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F. G. Foster 447 
Generalized Beta distribution (cont.) 
gs | | 
idee 3 4 Ss fe re# 8 9 10 
1 7 | 
es Bee pec eees “Werbri eal : | ead 
P | | | 
— 64 «(0-80 | 0-1355 01596-01811 | 0-2008 | 0-2191 00-2361 | 0-2521 | 0-2672 
“85 ‘1471 ‘1715 1933 | +2132 +2316 +2488 +2648 +2800 
90 | -1624 | +1873 | +2094 +2295 -2480 +2652 2814 | +2966 
“95 1867 | -2120 | -2345 | -2548 2734 | -2907 | -3069 | -3221 
99 | +2371 | +2629 | +2855 | +3058 3245 3416 | +3576 | +3726 
66 0-80 | 0-1317 | 0-1552 | 0-1762 | 0-1955 | 0-2133 | 0-2300 | 0-2457 | 0-2605 
85 | -1499 | - re” 4 — -22 é ie. i «@ 
eR ie ie oe ee 
95 | -1815 | +2063 | 2283 | +2482 +2665 2835 | +2904 | -3144 
99 | +2308 | +2560 | 2782 | -2982 3165 -3335 | -3492 | -3640 
68 0-80 | 0:1280 | 0-1509 | 0-1715 | 0-1904  0-2078 | 0-2242 | 0-2396 | 0-2542 
“85 -1390 | -1623 | -1831 | +2022 2198 | -2363 | -2518 | -2665 
90 | 1535 | +1773 | -1984 | +2177 2355 | -2521 | 2677 | +2824 
95 | +1767 | .2009 | +2224 | 2419 | +2599 | -2766 | -2923 | +3070 
99 | +2248 2495 | +2713 | -2910 | -3090 -3256 | +3412 +3558 
70 0-80 | 0-1246 | 0-1470 | 0-1671 | 0-1855 | 0-2026 | 0-2187 | 0-2338  0-2482 
‘85 | +1353 | -1580 | -1784 1971 | +2144 +2306 +2458 +2602 
-90 1495 | -1727 | -1934 | -2123 | -2297 | -2461 2614 | +2758 
‘95 | +1720 | -1958 | -2168 | -2360 | -2536 | -2701 2855 | -3000 
-99 | -2191 | -2433 | -2647 | -2840 | -3018 | -3182 | -3335 | -3479 
72 0-80 | 0-1213 | 0-1432 | 0-1629 | 0-1809 | 0-1977 | 0-2134 | 0-2283 | 0-2424 
85 | +1317 1540 | +1740 | -1922 | -2092 2251 | +2400 | -2542 
90 | +1456 1683 | -1886 | -2071 | -2242 2403 | +2553 +2696 
-95 | ‘1677 | +1909 | -2115 2303 | -2477 | -2638 -2790 | +2932 
‘99 | -2136 | -2374 | -2584 | +2774 2949 | -3111 | -3262 | -3404 
74 0-80 | 0-1182 | 0-1396 | 0-1588 | 0-1765 | 0-1930 | 0-2084 | 0-2230 | 0-2369 | 
85 1284 | -1502 | -1697 | -1876 +2043 2199 | +2346 +2485 | 
-90 | +1419 1641 | -1840 -2022 | +2190 -2347 | +2495 +2635 | 
‘95 | +1635 | -1862 2065 | -2249 | -2419 +2578 | +2727 +2868 | 
99 | +2085 +2318 2524 | -2711 | +2883 -3043 +3192 +3332 
| | | | 
76 0-80 | O-1153 | 0-1362 | 0-1550 | 0-1724 | 01885 | 0-2037 | 0-2180 | 0-2316 
‘85 | -1252 | +1465 | -1657 -1832 1995 | -2149 +2293 -2430 
90 | +1384 | -1602 | 1797 | +1975 -2140 | 2294 | +2440 | +2578 
95 | -1595 1818 | -2017 +2198 | +2365 | 2521 | +2668 | °2807 
-99 +2036 +2264 ‘2467 | +2651 2820 | +2977 | +3124 -3263 
78 0-80 | 0-1125 | 0-1329 | 01514 | 0-1684  0-1842 | 0-1991  0-2132  0-2266 
| . . | . | e x | € 
‘85 | +1222 1430 1618 1790 1950 | -2101 | +2243 | -2378 | 
90 | +1351 | +1564 +1755 -1930 2092 | +2244 ‘2387 | +2523 | 
| ‘95 | +1557 | -1776 ‘1971 | +2148 -2313 | +2466 | -2611 | -2748 | 
99 | -1989 | -2213 | -2413 | -2593 2760 | -2915 | -3060 | -3196 | 
80 0-80 | 0:1098 | 0-1298 | 01479 | 01646 | 01801  0-1948 | 0-2086 | 0-2218 | 
85 | +1193 | +1397 1581 | +1750 -1907 2055 | +2195 | +2327 | 
90 | +1320 1528-1715 ‘1887 | +2046 2196 | +2337 2470 | 
95 | +1521 +1735 -1927 2101 | +2263 +2414 2556 | +2691 
99 | +1944 2164 | +2360 | -2538 2702 2855 | -2998 | -3133 
82 0:80 | 01073 | 01269 | 01446 | 0-1610 | 0-1762 | 0-1906 | 0-2042 | 0-2172 
‘85 1165 | +1366 | +1546 | 1712 | +1866 -2011 +2149 +2279 | 
-90 1289 | +1494 1677 | +1846 2002 -2149 2288 | +2420 
95 1487 | -1697 +1884 2056 | -2215 +2364 +2504 | +2637 
-99 -1901 | -2118 +2310 | +2485 -2647 | +2797 -2938 | +3071 

This table gives the values of $x$ for which $\Pr(\theta_{\max} < x) = I_x(3; p, q) = P$, where $p = \tfrac{1}{2}(\nu_2 - 2)$, $q = \tfrac{1}{2}(\nu_1 - 2)$.

448 Upper percentage points of the generalized Beta distribution. II 
Generalized Beta distribution (cont.) 
$\nu_1$   $P$ | $\nu_2 = 3$ | 4 | 5 | 6 | 7 | 8 | 9 | 10
84 0-80 | 01048 | 0-1240 | 0-1414 | 0-1575 | 0-1725 | 0-1866 | 0-2000 | 0-2127 
85 | +1139 | -1385 | -1512 | -1675 1827 1969-2105 «| +2233 
‘90 | -1260 | -1461 | +1641 | +1807 1960-2105 |= -2241 +2371 
95 | 1454 | +1660 | 1844 -2013 2169 | -2316 | -2454 | -2584 
99 | +1860 | -2073 | -2262 | -2434 2504 | -2742 | -2881 | -3013 
| 86 0-80 | 01025 | 0-1213 | 0-1384 | 0-1541 | 0-1689 | 0-1828 | 0-1959 | 0-2085 
85 | -1114 | -1306 | -1480 | -1640 1789 | -1929 | -2062 | -2189 
| 90 | +1233 | -1429 | -1606 = -1769 1920 | +2062 | -2197 +2324 
| 95 | +1422 | -1625 +1806 = --1971 2125 | +2269 = -2405 || -2534 
99 | +1821 | -2030 | +2216 = -2386 2542 | -2689 | +2826 | -2956 
| 88 0:80 | 01003 | 01188 | 01355 | 01510 | 0-1654 | 0-1791 | 0-1920 | 0-2044 
| 85 | -1090 | -1279 | -1449 1606 ‘1753 | -1891 | = -2021 2146 
90 | -1206 | -1399 | -1573 | -1733 | -1881 | ‘2021 | -2154 | -2279 
| 95 | +1392 | -1591 | -1769 | +1932 | -2083 | +2225 | -2359 | -2486 
| 99 | -1783 | -1989 | -2172 | +2339 | -2493 | -2637 2773 | +2901 
| 90 0-80 | 0-0981  0-1163 — 0-1327 | 0-1479 01621 01755 | 01883 0-2004 | 
| 85 | +1067 1252 | +1419 | +1574 | +1718 1854 | +1982 | +2105 
‘90 | -1181 | -1371 1541 | +1698 1844, 1982 | -2112 | -2236 
95 | +1363, | +1558 | -1733 | -1893 | -2042 | -2182 | -2314 | -2440 
99 | 1747 | +1949 | 2129 | +2204 +2446 | +2588 | -2722 | -2848 
| 92 0-80 | 0-0961 0-139 | 01300 | 0-1450 | 01589 | 0-1721 | 0-1847 01967 
| 85 | +1045 | +1227 -1391 1542 | -1684 | ‘1818 | +1945  —--2066 
90 +1157 1343-1510 “1665 1809 | +1944 | -2072 | -2195 
| 95 | +1336 | +1527 | 1609 +1857 | +2003 2141 | «+2271 +2395 
99 | 1712 | -1911 2088 | +2250 | -2400 | -2541 | +2673 |  -2798 
94 0-80 | 0-0942  O-1116 | 0-1275 | 01421 | 0-1559 | 01689 | 0-1812 | 0-1930 
85 | 1024 | +1202 | 1363 | +1513 | 1652 | +1784 | -1908 | -2028 
90 | +1134 | -1316 | +1481 -1633 1774 | -1908 | +2034 | -2164 
| 95 | -1309 | -1497 | +1666 | -1821 1966 | -2101 | -2230 | -2362 
99 | +1679 -1874 | +2049 = --2209 2356 | 2495 | -2625 —--2748 | 
| 96 0-80 | 00923 | 01094 | 0-1250 | 01394 | 0-1529 | 0-1657 | 0-1779 | 0-1895 | 
| 85 | +1003 | -1179 | 1337 +1484 1621 | -1750 1873-1991 
| 90 | 1111 | 1291 14531602 | +1741 ‘1873-1997 2116 | 
| 95 | +1283 | -1468 1635 -1787 | +1930 | ‘2063 | -2190 -2310 
| 99 | +1647 1839 | -2011 2168 2314 | +2451 2579-2701 
98 0-80 | 0-0905 | 0-1073 | 0-1226 01368 | 0-1501 | 01627 | 0-1747  0-1861 
85 0984 -1156 | -1312 +1456  -1591 ‘1719-1840 «+1955 
| 90 1090 | +1266 | +1425 +1572 1709 “1839-1962 | -2079 
| 95 | +1259 | +1441 1604 +1755 | + +1895 | +2026 | +2151 +2270 
| 99 | -1616 | -1805 | -1975 | +2130 | -2273 | +2408 | -2535 | -2655 
100 0-80 | 0-0887 | 0-1053 | 01203 | 0-1343 | 0-1474 | 0-1598 | 0-1716 | 0-1828 
35 | -0965  -1134 | -1288 | -1429 1562 ‘1688 | -1807 | -1921 
“| +1069 +1242 +1399 = 1543 | «+1679 = -1806 | +1927, +2048 
| ; 1235 -1414 = 1575 | = +1723 1861 -1991 2114, +2231 
| 29 | +1586 | +1772 1939 | = -2092 2234 | -2367 2492 | -2611 
102 0:80 | 0-0871  0-1033 | O-1181 | 0-1318 | 0-1447 | 0-1569 | 0-1686 — 0-1797 
85 | -0947 ‘1113-1264 1404 1534 | +1658 | -1"76 | +1888 
| 90 +1049 | +1219 | -1374 1516 1649 | -1775 | -i894 | 2008 
| 95 +1212 | -1388 1547 1692 1829-1957 2078 +2193 
99 | +1557 ‘1741 1905 2056 2196 +2327 +2451 +2568 














This table gives the values of $x$ for which $\Pr(\theta_{\max} < x) = I_x(3; p, q) = P$, where $p = \tfrac{1}{2}(\nu_2 - 2)$, $q = \tfrac{1}{2}(\nu_1 - 2)$.
F. G. Foster 449
Generalized Beta distribution (cont.) 
$\nu_1$   $P$ | $\nu_2 = 3$ | 4 | 5 | 6 | 7 | 8 | 9 | 10
104 0-80 | 0-0855 | 0-1014 | 0-1160 | 0-1295 | 01422 | 0-1542 | 0-1657  0-1766 
‘85-0930 1093-1241 +1379 1508 | -1630 -1746 -1856 
‘90 | +1030 | -1197 | -1349 1489 | +1620 | -1744 | +1862 “1974 
| ‘95 | +1190 | -1363 | +1519 1663 | -1797 | -1923 | +2043 +2157 
99 | 1529 | 1710 | -1872 +2021 +2159 2288 | -2411 +2527 
| 106 0-80 | 09-0839 | 0-0996 | 01139 | 01272 | 01397 | 0-1516 | 0-1629 | 0-1737 
| 85 | -0913 | -1074 | -1220 | -1355 | -1482 | -1602 | -1716 | -1826 
| ‘90 ‘1011 1176 -1326 1464 | -1593 | +1715 | -1831 1942 
95 -1169 1340-1493 | +1635 1767 | +1891 | -2009 +2122 
| ‘99 -1503 1681 | -1841 -1987 2123 +2251 +2372 +2487 
108 0-80 | 0-0824 | 0-0979 | 01120 | 0-1251 | 0-1374 | 01490 | 0-1602 | 0-1708 
85 | -0897 | +1055 -1198 1332 | +1457 | -1575 -1688 | -1796 
‘90 | -0993 | -1156 | -1303 | -1439 | -1566 | -1686 | -1801 | -1910 
‘95 +1149 1316 | -1468 | +1607 | +1737 | -1860 | +1977 | -2088 
99 | +1477 | +1652 | -1810 | 1954 -2089 | +2215 +2334 | +2448 
110 0-80 | 0-0810 | 00962 | 0-1100 | 01229 | 0-1351 | 01466 | 0-1576 | 0-1681 
85 0881-1037 | -1178 | -1309 1433 | +1549 | -1661 | -1767 
‘90 | -0976 | -1136 | -1281 | -1415 1540-1659 | -1772 | -1880 
| ‘95 | -1129 | -1294 1443 | +1581 -1709 | +1830 1945 | +2055 
| ‘99 +1452 1625 ‘1780 | +1923 | -2056 | -2180 +2298 +2410 
| 112 0-80 | 0-0796 | 0:0945 | 0-1082 | 01209 | 0-1329 | 01442 | 0-1550  0-1654 
‘85 0866 | -1019 | -1158 | -1288  -1409 +1524 -1634 -1739 
| 90 | -0960 | -1117 | -1259 ‘1391 ‘1515 | -1632 +1744 -1850 
| ‘95 | -1110 | -1273 | +1419 1555 -1682 | -1801 | -1915 +2023 
| -99 1428 | +1598 | -1752 +1892  -2023 +2146 +2263 +2373 
| 114 0-80 | 00782 | 0-0930 | 0-1064 | 0-1189 | 0-1307 | 0-1419 | 0-1526 | 0-1628 
| ‘85 | -0851 | +1002 | +1139 | +1267 "1387 | +1500 | -1608 -1712 
-90 | 0944 | -1098 +1239 +1369 1491 | -1606 -1716 -1822 
| ‘95 | +1091 | -1252 -1396 -1530 +1655 ‘1773 | +1885 -1992 
| 99 | +1405 | +1573 +1724 -1863 1992 | -2113 | -2228 | -2338 
| 116 0-80 | 0-0769 | 0-0914 | 0-1047 | 0-1170 | 0-1286 | 0-1397 | 0-1502 | 0-1603 
| ‘85 | -0837 | -0986 ‘1121 +1247 ‘1365 | +1477 -1583 | -1686 
: 90 | -0928 | -1080 “1219 +1347 1468 | +1581 | 1690 | -1794 
| 95 | +1074 1231 +1374 -1506 1629 | -1746 | +1856 | -1962 
| 99 | -1382 | +1548 | -1697 -1834 -1961 2082 | +2195 | -2303 
118 0-80 | 0-0757 | 00900 | 0-1030 | 01152 | 0-1266 | 01375 | 0-1479 | 0-1579 
‘85 0824 | -0970 -1103 +1227 1343 | +1454 | -1559 -1660 
| 90 | -0913 | +1063 | +1200 | -1326 1445 | -1557 | +1664 | -1767 
| 95 | +1056 | -1212 | -1353 | -1483 -1604 ‘1719 | -1829 | -1933 
| 99 1360 | +1524 | -1671 ‘1806 | -1932 2051 | +2163 | -2270 
| 120 0-80 | 0-0744 | 0-0885 01014 | 0-1134 | 01247 | 0-1354 | 0-1457 — 0-1555 | 
85 | -0810 | -0954 -1086 -1208 -1323 | +1432 -1536 1636 | 
-90 0898 | -1046 ‘1181 | +1306 1423 | +1534 | -1640 ‘1741 
| 95 1040 | -1193 -1332 | +1460 -1580 | -1694 | -1802 -1905 
| ‘99 | +1339 | +1500 1645 | +1779 1903 | +2021 | +2132 +2237 
122 0-80 | 0-0733 | 0-0871 | 0-0998 , 0-1116 | 0-1228 | 0-1334 | 0-1435  0-1532 | 
85 | -0798 | -0940 1069 | --1189 -1303 1410 | +1513 1612 | 
90 | -0884 | -1030 ‘1163 | +1286 -1401 “1511 “1615 ‘1715 | 
95 | -1023 | -1175 ‘1311 -1438 “1557 -1669 +1775 1877 | 
99 | +1319 | +1478 1621 -1753 -1876 -1991 -2101 -2206 | 

This table gives the values of $x$ for which $\Pr(\theta_{\max} < x) = I_x(3; p, q) = P$, where $p = \tfrac{1}{2}(\nu_2 - 2)$, $q = \tfrac{1}{2}(\nu_1 - 2)$.


Generalized Beta distribution (cont.) 

















$\nu_1$   $P$ | $\nu_2 = 3$ | 4 | 5 | 6 | 7 | 8 | 9 | 10
| 124 ©80 | 0-0721 | 0-0858 | 0-0983  0-1100 | 01209 | 0-1314 | 0-:1414 | 01510 
“85 0785 | -0925 1053 1172 | -1283 | -1390 | +1491 | -1588 
90 0871 -1014 1145 1267 | -1381 -1489 | -1592 -1691 
95 -1008 | +1157 -1292 1417 1534 1644 | -1750 +1851 
99 +1299 | +1456 1597 1727 -1849 1963 | -2072 | +2175 
| 126 0-80 | 0-0710 | 0-0845 | 0-0968 | 01083 | 0-1192 | 0:1295 | 0-1394 | 0-1488 
| 85 0773 0911 -1037 “1154 | +1265 | +1369 | -1470 | +1566 
| -90 0857 -0999 -1128 1248 | -1361 | -1467 | -1569 | -1667 
| “95 0992 +1139 +1273 1396 | -1512 | -1621 | -1725 | +1825 
| -99 1279 1434 1574 1702 | -1822 | -1936 | +2043 | +2145 
| 128 0-80 | 0-:0699 | 0-0832 | 0-0954 | 0-1067 | 0-1174 | 0-1276 | 01374 | 0-1467 
| “85 0761 -0898 1022 1137 | -1246 | +1350 | -1449 | -1544 
| -90 0844 | -0984 “1112 1230 -1341 | +1446 | +1547 | +1644 
| “95 0978 1123 +1254 1376 +1490 | -1598 ‘1701 +1799 
-99 -1261 1413 1551 +1678 | -1797 | 1909-2015 | +2116 
| 130 0-80 | 0-:0689  0-0820 | 0:0940 | 0-1052 | 0-1158 | 0-1258 0-1354 | 0-1447 
| “85 0750 0884 1007 1121 | +1229 | -1331 | -1429 | -1522 
| 90 | -0832 0970 1096 | -1212 | -1322 | -1426 | -1526 | +1621 
| 95 0963 +1106 1236 | +1356 | +1469 | 1576 1678 | +1775 
| -99 1242 +1393 +1529 | 1655 | +1772 | +1883 | +1988 | +2088 | 
132 0-80  0-0679 | 0-0808 | 0:0926 | 0-1037 | 01141 | 0-1241 | 0-1336 | 0-1427 
“85 -0739 | +0872 0992 | 1105 | 1211 1312 | +1409 | +1502 
| -90 -0820 -0956 1080 | +1195 | -1304 | -1406 1505 | +1599 
| 95 0949 1091 1219 | +1337 | -1449 | +1554 | = -1655 +1751 
-99 -1225 | +1374 1508 | +1632 | -1748 | -1857 | 1961 | -2060 
134 0-80 | 0-0669 | 0-0797  0-0913 | 0-1022 | 0-1125 | 0-1223 | 0-1317 | 0-1408 | 
“85 0728 0859 0978 | -1090 | +1195 | +1294 |} +1390 | +1481 | 
‘90 | -0808 0942 1065 | +1179 | 1286 | -1387 | +1484 | -1578 | 
“95 0936 +1075 -1202 | -1319 | +1429 +1533 | +1633 | -1728 | 
99 1207 +1355 -1488 | -1610 1725 +1833 | +1936 | 2034 | 
| 136 0-80 | 0-:0660 | 0-0785 06-0901 | 01008  0-1110 | 0-1207 | 0-1300 | 0-1389 | 
85 | -0718 0847 0965 | -1075 ‘1178 | = +1277 | -1371 | +1462 
| 90 | -0797 | -0929 1050 | +1162 | +1268 | «+1369 | +1465 | +1557 
| ‘95 | -0923 -1060 ‘1185 | +1301 | +1410 | +1513 ‘1611 | +1705 
| -99 -1191 1336 -1467 1589 | 1702 | +1809 | ‘1910 | +2007 | 
138 0-80 | 0-:0650 | 0-0774 | 0-0888 | 0-:0994  0-1095 | 0-1191 | 0-1282 | 0-1371 | 
“85 -0708 0835 0952 -1060 | -1162 | -1260 | -1353 | -1442 
90 | -0785 0916 1036 1147 | +1251 | +1350 | +1445 | +1536 
‘95 | -0910 1046 ‘1169 | +1284 | +1391 1493 | +1590 | +1683 
-99 1174 -1318 ‘1448 | -1568 | -1680 -1785 | +1886 | +1982 
140 0:80 | 0-0641 | 0-0764 | 0-0876 | 0-0981 | 0-:1080 | 0-1175 | 0-1266 | 01353 
85 | -0698 0824 0939 1046 +1147 1243 +1335 | -1424 | 
-90 0775 0904 1022 1131 12385 | +1333 | +1427 | “1517 | 
95 0897 1032 +1153 1267 -1373 | -1473 | +1569 |  -1662 | 
-99 1159 +1301 1429 1547 1658 +1763 “1862 | 1957 | 
| 142 0-80  0-0632  0-0753 00864  0-0968  0-1066 | 0-1160 | 0-1249 | 0-1336 | 
“85 0689 0813 0926 | -1032 -1132 +1227 -1318 | +1406 | 
| -90 0764 | -0892 1008 | -1116 | +1218 | -1315 | -1408 | -1497 | 
“95 0885 -1018 -1138 1250 | -1355 | -1455 | -1550 1641 
.99 | .1143 | .1283 | .1410 | .1527 | .1637 | .1740 | .1839 | .1933 |

This table gives the values of $x$ for which $\Pr(\theta_{\max} < x) = I_x(3; p, q) = P$, where $p = \tfrac{1}{2}(\nu_2 - 2)$, $q = \tfrac{1}{2}(\nu_1 - 2)$.
Generalized Beta distribution (cont.) 
$\nu_1$   $P$ | $\nu_2 = 3$ | 4 | 5 | 6 | 7 | 8 | 9 | 10
144 0-80 | 0-0624 | 0-:0743 | 0:0853  0-0955 | 0-1052 | 0-1145 | 0-1233 | 0-1319 
| “85 ‘0679 | -0802 0914 -1018 “1117 -1211 | -1301 | +1388 
‘90 | +0754 | -0880 -0995 1102 -1203 -1299 -1390 | -1478 
‘95 | -0874 -1004 -1123 1234 -1338 +1436 -1530 | +1620 
99 | +1128 1267 -1392 | -1508 | -1616 -1719 | -1816 | -1909 
146 0-80 | 0:0616 | 0-0734 | 0-0842  0-0943 | 0-1039 | 0-1130 | 0-1218 | 0-1302 
85 | -0670 | -0791 -0902 -1005 -1103 -1196 “1285 | +1371 
-90 0744 -0868 -0982 1088 -1187 -1282 -1373 | 1460 | 
95 -0862 -0991 -1109 -1218 -1321 -1418 ‘1511 | +1600 | 
| -99 ‘1114 | -1251 -1374 | +1489 -1596 -1697 -1794 | +1886 | 
| 148 0-80 | 0-0608 | 0-0724  0-0831 | 0-0931 | 0-1026  0-1116 | 0-1203 | 0-1286 
| 85 0662 | -0781 | -0890 | -0992 -1089 -1181 “1269 | +1354 
‘90 | -0734 | -0857 | -0969 | -1074 -1173 -1266 1356 +1442 | 
‘95 | -0851 | -0979 | -1095 | -1203 +1304 | -1401 -1493 -1581 | 
-99 1099 | -1235 | +1357 | -1470 +1577 -1677 | +1772 1864 | 
| 150 0-80 | 0-0600 | 0-0715 | 0-0820 | 0-0919  0-1013 | 0-1102 | 0-1188 | 0-1270 | 
| 85 | -0653 | -0771 | -0879 | -0980 -1075 | -1166 | -1253 | -1337 | 
| ‘90 | -0725 | -0846 | -0957 | -1060 | -1158 | -1251 | 1339 | +1425 | 
| 95 | -0840  -0966 ‘1081 | -1188 | -1288 | -1383 -1474 | +1562 | 
| ‘99 | +1085 | -1219 -1340 1452 | +1557 -1657 | ‘1751 | +1842 
| 152 0-80 | 0-0592 | 0:0706 | 0-0810  0-0908  0:1000  0-1088  O-1173 | 01255 | 
“85 0645 | -0761 -0868 -0968 -1062 +1152 | +1238 1321 | 
‘90 | -0716 -0836 0945 -1047 -1144 1235 | 1323 | +1408 | 
95 0829 | -0954 1068 | -1173 -1273 1367 | 1457 | +1543 | 
-99 1072 | -1204 | -1324 | -1435 | -1539 | -1637 | ‘1731 | +1820 | 
| 154 0-80 | 0-0584 | 0-0697 | 0-0800 | 0-0896 | 0-0988 | 01075 | 0-1159 | 0-1240 | 
| ‘85 | -0637 0752 | -0857 | -0956 | -1049 | +1138 | .1223 | +1305 
| 90 | -0707 | -0825 | -0934 | -1035 | -1130 -1221 | -1308 | -1391 
| ‘95 | -0819 0942 1055 | +1159 | +1257 -1350 | +1440 *1525 
‘99 | -1059 1190 1308 | -1418 +1521 1618 | +1710 -1799 | 
| 156 0-80 | 00577 | 00688 | 0-0790 | 0-0886 | 00976 | 0-1062 | 0-1145 | 0-1225 | 
| “85 | 0629 | 0743 0847 | 0944 | +1037 +1125 -1209 -1290 
| ‘90 | -0698 | -0815 | -0922 +1022 = -1116 -1206 | -1292 | -1375 
| ‘95 | -0809 | -0931 | -1042 | -1145 | -1242 +1335 +1423 -1507 | 
| ‘99 | +1046 | +1175 | -1293 | +1401 | +1503 -1599 | +1691 | 1779 | 
| 158 0-80 | 0-0570 | 0-0680 | 0-0781 | 0-0875 | 0-0964 | 01050 | 0-1132 | 01211 
85 | -0621 | -0734 ‘0837 | -0933 +1024 ‘1111 ‘1195 | +1275 
90 | -0689 | -0805 0911 | -1010 | -1103 ‘1192 | +1277 | +1359 
95 0799 | -0920 | -1029 | -1132 -1228 | -1319 | +1406 | -1490 
99 ‘1033 | +1161 | 1277 1385 | +1485 | 1581 | -1672 | :1759 
160 0-80 | 0-0563 | 0-0672 | 0-0771 | 0-0865 | 0-0953 , 0-1038 | 0-1119 | 0-1197 
85 -0613 0725 -0827 0922 | -1012 | +1098 | -1181 | -1260 
-90 -0681 -0796 -0900 0998  -1090 | -1178 | +1262 | "1343 
95 | -0789 | -0909 | -1017 1118 | -1213 | -1304 | -1390 | 1473 
| -99 -1021 1148 | -1262 1369 | -1468 | -1563 -1653 “1739 
162 0-80 | 0-0556 | 0-0664 | 0-0762 | 0-0854 | 0-0942 | 0-1026 | 0-1106 | 0-1184 
| “85 0606 -0716 -0817 0911 -1000 -1086 +1167 +1246 
-90 -0673 -0786 -0890 -0986 -1078 | +1165 +1248 +1328 
95 -0780 -0898 1005 -1105 -1200 -1289 -1375 *1457 
-99 -1009 1134 +1248 1353 +1452 | “1545 +1634 -1720 
| 


























This table gives the values of $x$ for which $\Pr(\theta_{\max} < x) = I_x(3; p, q) = P$, where $p = \tfrac{1}{2}(\nu_2 - 2)$, $q = \tfrac{1}{2}(\nu_1 - 2)$.













Generalized Beta distribution (cont.) 




















$\nu_1$   $P$ | $\nu_2 = 3$ |    4   |    5   |    6   |    7   |    8   |    9   |   10
164  0.80 | 0.0550 | 0.0656 | 0.0753 | 0.0845 | 0.0931 | 0.1014 | 0.1093 | 0.1170
     0.85 | 0.0599 | 0.0708 | 0.0807 | 0.0901 | 0.0989 | 0.1073 | 0.1154 | 0.1232
     0.90 | 0.0665 | 0.0777 | 0.0879 | 0.0975 | 0.1065 | 0.1151 | 0.1234 | 0.1313
     0.95 | 0.0771 | 0.0888 | 0.0994 | 0.1093 | 0.1186 | 0.1275 | 0.1359 | 0.1441
     0.99 | 0.0997 | 0.1121 | 0.1234 | 0.1338 | 0.1436 | 0.1528 | 0.1616 | 0.1701
166  0.80 | 0.0543 | 0.0648 | 0.0744 | 0.0835 | 0.0921 | 0.1002 | 0.1081 | 0.1157
     0.85 | 0.0592 | 0.0699 | 0.0798 | 0.0890 | 0.0978 | 0.1061 | 0.1141 | 0.1219
     0.90 | 0.0657 | 0.0768 | 0.0869 | 0.0964 | 0.1053 | 0.1139 | 0.1220 | 0.1299
     0.95 | 0.0762 | 0.0877 | 0.0982 | 0.1080 | 0.1173 | 0.1260 | 0.1344 | 0.1425
     0.99 | 0.0986 | 0.1109 | 0.1220 | 0.1323 | 0.1420 | 0.1511 | 0.1599 | 0.1683
168  0.80 | 0.0537 | 0.0641 | 0.0736 | 0.0825 | 0.0910 | 0.0991 | 0.1069 | 0.1144
     0.85 | 0.0585 | 0.0691 | 0.0789 | 0.0880 | 0.0967 | 0.1049 | 0.1129 | 0.1205
     0.90 | 0.0649 | 0.0759 | 0.0859 | 0.0953 | 0.1042 | 0.1126 | 0.1207 | 0.1285
     0.95 | 0.0753 | 0.0867 | 0.0971 | 0.1068 | 0.1160 | 0.1246 | 0.1330 | 0.1409
     0.99 | 0.0975 | 0.1096 | 0.1206 | 0.1308 | 0.1404 | 0.1495 | 0.1582 | 0.1665
170  0.80 | 0.0531 | 0.0633 | 0.0728 | 0.0816 | 0.0900 | 0.0980 | 0.1057 | 0.1132
     0.85 | 0.0578 | 0.0684 | 0.0780 | 0.0870 | 0.0956 | 0.1038 | 0.1116 | 0.1192
     0.90 | 0.0642 | 0.0751 | 0.0850 | 0.0942 | 0.1030 | 0.1114 | 0.1194 | 0.1271
     0.95 | 0.0745 | 0.0858 | 0.0961 | 0.1056 | 0.1147 | 0.1233 | 0.1315 | 0.1394
     0.99 | 0.0964 | 0.1084 | 0.1193 | 0.1294 | 0.1389 | 0.1479 | 0.1565 | 0.1647
172  0.80 | 0.0525 | 0.0626 | 0.0720 | 0.0807 | 0.0890 | 0.0970 | 0.1046 | 0.1120
     0.85 | 0.0572 | 0.0676 | 0.0771 | 0.0861 | 0.0946 | 0.1027 | 0.1104 | 0.1179
     0.90 | 0.0635 | 0.0742 | 0.0840 | 0.0932 | 0.1019 | 0.1102 | 0.1181 | 0.1257
     0.95 | 0.0736 | 0.0848 | 0.0950 | 0.1045 | 0.1134 | 0.1220 | 0.1301 | 0.1380
     0.99 | 0.0953 | 0.1072 | 0.1180 | 0.1280 | 0.1374 | 0.1463 | 0.1548 | 0.1630
174  0.80 | 0.0519 | 0.0619 | 0.0712 | 0.0798 | 0.0880 | 0.0959 | 0.1035 | 0.1108
     0.85 | 0.0565 | 0.0668 | 0.0763 | 0.0851 | 0.0935 | 0.1016 | 0.1092 | 0.1167
     0.90 | 0.0628 | 0.0734 | 0.0831 | 0.0922 | 0.1008 | 0.1090 | 0.1168 | 0.1244
     0.95 | 0.0728 | 0.0839 | 0.0940 | 0.1034 | 0.1122 | 0.1207 | 0.1287 | 0.1365
     0.99 | 0.0943 | 0.1060 | 0.1167 | 0.1266 | 0.1360 | 0.1448 | 0.1532 | 0.1613
176  0.80 | 0.0513 | 0.0612 | 0.0704 | 0.0790 | 0.0871 | 0.0949 | 0.1024 | 0.1096
     0.85 | 0.0559 | 0.0661 | 0.0755 | 0.0842 | 0.0925 | 0.1005 | 0.1081 | 0.1154
     0.90 | 0.0621 | 0.0726 | 0.0822 | 0.0912 | 0.0997 | 0.1078 | 0.1156 | 0.1231
     0.95 | 0.0720 | 0.0830 | 0.0929 | 0.1023 | 0.1110 | 0.1194 | 0.1274 | 0.1351
     0.99 | 0.0932 | 0.1049 | 0.1155 | 0.1253 | 0.1345 | 0.1433 | 0.1516 | 0.1597
178  0.80 | 0.0507 | 0.0606 | 0.0696 | 0.0781 | 0.0862 | 0.0939 | 0.1013 | 0.1085
     0.85 | 0.0553 | 0.0654 | 0.0746 | 0.0833 | 0.0915 | 0.0994 | 0.1070 | 0.1142
     0.90 | 0.0614 | 0.0718 | 0.0813 | 0.0902 | 0.0987 | 0.1067 | 0.1144 | 0.1218
     0.95 | 0.0712 | 0.0821 | 0.0920 | 0.1012 | 0.1099 | 0.1181 | 0.1261 | 0.1337
     0.99 | 0.0922 | 0.1038 | 0.1143 | 0.1240 | 0.1331 | 0.1418 | 0.1501 | 0.1580
180  0.80 | 0.0502 | 0.0599 | 0.0689 | 0.0773 | 0.0853 | 0.0929 | 0.1003 | 0.1074
     0.85 | 0.0547 | 0.0647 | 0.0739 | 0.0824 | 0.0906 | 0.0984 | 0.1059 | 0.1131
     0.90 | 0.0607 | 0.0710 | 0.0805 | 0.0893 | 0.0976 | 0.1056 | 0.1132 | 0.1206
     0.95 | 0.0705 | 0.0812 | 0.0910 | 0.1001 | 0.1087 | 0.1169 | 0.1248 | 0.1323
     0.99 | 0.0913 | 0.1027 | 0.1131 | 0.1227 | 0.1318 | 0.1404 | 0.1486 | 0.1565
182  0.80 | 0.0497 | 0.0593 | 0.0681 | 0.0765 | 0.0844 | 0.0919 | 0.0992 | 0.1063
     0.85 | 0.0541 | 0.0640 | 0.0731 | 0.0816 | 0.0896 | 0.0974 | 0.1048 | 0.1119
     0.90 | 0.0601 | 0.0703 | 0.0796 | 0.0883 | 0.0966 | 0.1045 | 0.1121 | 0.1194
     0.95 | 0.0697 | 0.0803 | 0.0900 | 0.0991 | 0.1076 | 0.1157 | 0.1235 | 0.1310
     0.99 | 0.0903 | 0.1016 | 0.1119 | 0.1215 | 0.1304 | 0.1390 | 0.1471 | 0.1549





This table gives the values of $x$ for which $\Pr(\theta_{\max} < x) = I_x(3; p, q) = P$, where $p = \tfrac{1}{2}(\nu_2 - 2)$, $q = \tfrac{1}{2}(\nu_1 - 2)$.
Generalized Beta distribution (cont.) 
$\nu_1$   $P$ | $\nu_2 = 3$ | 4 | 5 | 6 | 7 | 8 | 9 | 10
| 184 0-80 -0491 -0587 -0674 0757 -0835 -0910 -0982 1052 | 
| *85 -0535 -0633 -0723 0807 -0887 -0964 -1037 1108 
-90 -0595 -0695 -0788 0874 -0956 -1034 -1109 1182 
95 | -0690 -0795 ‘0891 -0981 -1065 -1146 -1223 1297 
-99 | -0894 -1006 -1108 1202 -1291 | +1376 +1456 | 1534 
186 =. 0-80 -0486 -0580 -0667 0749 0826  ~—s- -0901 -0972 | -1041 
*85 -0530 -0627 -0716 0799 -0878 “0954 -1027 -1097 
‘90 | -0588 -0688 -0780 0865 -0946 -1024 -1098 | -1170 
“95 -0683 -0787 -0882 0971 -1054 +1134 *1211 | 1284 | 
“99 -0884 -0995 +1096 1190 +1278 -1362 +1442 1519 | 
188 0:80 | 0-048] 0-0574 0-0661 | 0-0741 0-0818 | 0-0892 0-0962 | 0-1031 
85 | +0524 -0620 -0708 ‘0791 -0869 -0944 ‘1016 | = -1086 
90 | +0582 -0681 -0772 -0857 -0937 -1014 -1087 1158 
| “95 -0676 -0779 -0873 -0961 -1044 -1123 *1199 | 1272 
| “99 -0875 -0985 -1085 +1178 -1266 -1349 | +1428 | 1504 
| 190 0-80 | 0-0476 | 0-0569 | 0-0654 | 0-0734 | 0-0810 0-083 | 0-0953 | 0-1021 
“85 | -0519 -0614 -0701 | -0783 -0861 ‘0935 | -1006 -1075 
| -90 | -0576 -0674 0764 | -0848 -0928 -1004 -1077 1147 
95 | -0669 ‘0771 -0864 ‘0951 -1033 *1112 -1187 | 1259 
| ‘99 | -0867 -0976 -1075 +1167 +1253 +1336 “1414 | 1490 
192 0-80 | 0-0471 | 0-0563 | 0-0647 | 0-0727 | 0-0802 | 0-0874 | 0-0944 | O-1011 
85 | -0514 -0608 -0694 -0775 -0852 -0926 -0997 -1065 
-90 -0570 -0667 -0756 -0840 -0918 -0994 -1066 1136 
‘95 | -0662 -0763 -0856 +0942 -1023 1101 +1175 1247 
-99 -0858 -0966 +1064 +1155 -1241 +1323 -1401 1476 
| 194 0:80 | 0:0467 0-0557 0-0641 0-0719 0-0794 0-0866 0-0935 0-1001 | 
*85 0508 -0602 -0687 -0768 -0844 -0917 -0987 1055 
-90 | -0565 -0661 | -0749 | 0831 -0910 -0984 -1056 | 1125 | 
95 | +0655 -0756 -0847 | -0933 -1013 +1090 -1164 | 1235 
“99 0849 -0956 | -1054 | +1144 +1229 -1310 *1388 | 1462 | 








This table gives the values of $x$ for which $\Pr(\theta_{\max} < x) = I_x(3; p, q) = P$, where $p = \tfrac{1}{2}(\nu_2 - 2)$, $q = \tfrac{1}{2}(\nu_1 - 2)$.










STATISTICAL ANALYSIS USING LOCAL PROPERTIES OF 
SMOOTHLY HETEROMORPHIC STOCHASTIC SERIES 


By G. H. JOWETT 
Department of Statistics, University of Sheffield 


SUMMARY. A widening of the concept of stationarity leads to the concepts of smooth hetero- 
morphy and local homomorphy in stochastic series, obviating much of the need for the introduction 
of trend into structural specifications of statistical data. It is shown that formulae for sampling 
properties of local statistics in stationary stochastic series are applicable as they stand to series having 
these less restricted properties. 


1. INTRODUCTION 


In a recent paper (1955a) the author established the principle that the sampling properties, 
and notably the standard errors, of statistics constructed from local comparisons of terms in 
stationary normal stochastic series were approximately deducible from the short term 
variational properties of the series themselves. In making practical use of such statistics, it 
is not necessary to know the mean, variance, or serial correlations of the series, all of which 
are dependent on the long-term variational properties of the series; it suffices to know the 
values of the serial and lag variation functions for short lags, i.e. the mean semi-squared 
differences of terms in the series separated by distances which are short. 

This principle is important because in many practical applications the stretches of series 
which constitute the data are much too short to provide accurate and unbiased estimates of 
long-term variational properties, whereas accurate and unbiased estimates of the lag variation
functions for short lags can be obtained even, say, by combining evidence from short
scraps of series having the same variational properties but different means. 
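
The pooling of evidence from short scraps can be sketched as follows. This is a minimal illustration, not a procedure taken from the papers cited: the function name `pooled_lag_variation` and the toy data are assumptions, and the estimator shown is simply the pooled mean semi-squared difference at a given lag.

```python
from typing import Sequence


def pooled_lag_variation(scraps: Sequence[Sequence[float]], lag: int) -> float:
    """Pooled mean semi-squared difference at a given lag.

    Each scrap is a short stretch of equally spaced observations. Only
    differences within a scrap are used, so a different mean level in each
    scrap does not affect the estimate.
    """
    total = 0.0
    count = 0
    for scrap in scraps:
        for i in range(len(scrap) - lag):
            diff = scrap[i + lag] - scrap[i]
            total += 0.5 * diff * diff
            count += 1
    if count == 0:
        raise ValueError("no pairs of terms are available at this lag")
    return total / count


# Two hypothetical scraps with the same local behaviour but different means.
scraps = [[0.0, 1.0, 0.0, 1.0], [10.0, 11.0, 10.0, 11.0]]
lag_one = pooled_lag_variation(scraps, 1)
lag_two = pooled_lag_variation(scraps, 2)
```

Because only within-scrap differences enter the sum, shifting any scrap by a constant leaves the estimate unchanged, which is the property the argument relies on.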

In recent papers the author has proposed the application of this principle in a number of 
techniques, including the following: 

(i) Linear (1952; Hebden & Jowett, 1952) and spatial (1955a) systematic sampling. 

(ii) Trend-reduced regression analysis (1955). 

(iii) Accuracy of serial variation statistics (1955a) and the fitting of serial variation 
curves (Davies & Jowett, 1956). 

(iv) Systematic linear experimental arrangements, and the comparison of cycle phase
means and interpenetrating sample means (1955c). 

(v) The significance of changes in level between successive stretches from the same time 
series (Jump analysis) (1955d). 

In these applications, the long-term variational properties of the series are largely 
irrelevant, since long-term serial variation parameters are effectively differenced away by 
operations involved in the sampling formulae. The question therefore arises of whether for 
application of these techniques and formulae the long-term properties need in fact be 
stationary at all, since their parameters are not considered except in a general sort of way, in 
making an assumption of smoothness of change with increasing lags. It will be shown in this 
paper that so long as certain smoothness assumptions hold, these long-term parameters need 
not be invariant under translation for the formulae to hold (and without modification). 
Furthermore, these smoothness assumptions lead to the definition of a more general class of 
stochastic series which may be termed smoothly heteromorphic, and, in so far as the local 
parameters of such series are approximately invariant under translation, locally homomorphic.
These terms are new, and the second is proposed as being more acceptable on
semantic grounds than the tautological term locally stationary which is suggested by current 
terminology. It is also useful to use the unqualified concept of homomorphy (contr. hetero- 
morphy) as implying effective invariance of a parameter under translation between certain 
limits only, not under all translations as in stationary series. 

These new definitions are important from the theoretical point of view; smoothly hetero- 
morphic and locally homomorphic series may be incorporated as error terms into models for 
data involving successively recorded elements without the embarrassing theoretical implica- 
tions (e.g. of indefinite potential continuation) implicit in assuming stationary error terms; 
furthermore, they may be used in many circumstances where the assumption of strict 
invariance under translation would clearly not be justified. Their practical importance lies 
mainly in permitting the extended use of formulae already established for stationary series 
in (i)-(v), an extended use for which the formulae require little if any modification. Hence, 
when the occasion for practical use of these formulae arises, the question of whether the 
series involved are strictly stationary often need not even be considered. 

In establishing this as a general principle, the exposition which forms the nucleus of this 
paper has necessarily become rather generalized and abstract; yet since the way in which the 
principle works is fundamentally simple, it will be demonstrated first in relation to a par- 
ticular formula, taken from (iv) and used for a similar purpose in (1955a). 


2. DEVELOPMENT OF IDEAS IN RELATION TO A SPECIFIC EXAMPLE 


Suppose $x(t)$ to be a random function of $t$, such as a statistical time series variate for
continuous time $t$. For any particular value $t_1$ of $t$, $x(t_1)$ will be taken to have a probability
distribution with mean $\mu(t_1)$ and standard deviation $\sigma(t_1)$; in general these parameters will
not be constant for all values of $t_1$. For any pair of values $t_1$, $t_2$, the covariance $\operatorname{cov}[x(t_1), x(t_2)]$
will be a function of both $t_1$ and $t_2$, not merely, as in stationary series, of their difference, and
will be related to the serial variation function

$$\delta(t_1, t_2) = E[\tfrac{1}{2}(x(t_1) - x(t_2))^2], \tag{1}$$

which is more useful than the covariance in practical work, by the formula

$$\delta(t_1, t_2) = \tfrac{1}{2}(\mu(t_1) - \mu(t_2))^2 + \tfrac{1}{2}\sigma^2(t_1) + \tfrac{1}{2}\sigma^2(t_2) - \operatorname{cov}[x(t_1), x(t_2)]. \tag{2}$$
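
The relation between (1) and (2) can be checked exactly on a small example. The sketch below is an illustration only: the three-point joint distribution and the function names are assumptions introduced here, chosen so that both sides can be evaluated without simulation.

```python
from typing import List, Tuple

# A hypothetical joint distribution of (x(t1), x(t2)) as (x, y, probability).
Joint = List[Tuple[float, float, float]]


def serial_variation_direct(joint: Joint) -> float:
    """Definition (1): the expected semi-squared difference."""
    return sum(p * 0.5 * (x - y) ** 2 for x, y, p in joint)


def serial_variation_from_moments(joint: Joint) -> float:
    """Formula (2): means, variances and covariance combined."""
    mean_x = sum(p * x for x, _, p in joint)
    mean_y = sum(p * y for _, y, p in joint)
    var_x = sum(p * (x - mean_x) ** 2 for x, _, p in joint)
    var_y = sum(p * (y - mean_y) ** 2 for _, y, p in joint)
    cov_xy = sum(p * (x - mean_x) * (y - mean_y) for x, y, p in joint)
    return 0.5 * (mean_x - mean_y) ** 2 + 0.5 * var_x + 0.5 * var_y - cov_xy


joint = [(0.0, 1.0, 0.25), (2.0, 1.0, 0.25), (1.0, 3.0, 0.5)]
direct = serial_variation_direct(joint)
from_moments = serial_variation_from_moments(joint)
```

For this distribution the two means differ (1 and 2), so the example also shows the mean-difference term of (2) contributing to the serial variation function.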


A simple statistic $U$, which might be used to test the significance of the difference between
two phase means, at interval $\theta w$ $(0 < \theta < \tfrac{1}{2})$, of a suspected cycle of period $w$, is given by the
formula

$$nU = [x(T) - x(T + \theta w)] + [x(T + w) - x(T + w + \theta w)] + \dots + [x(T + (n-1)w) - x(T + (n-1)w + \theta w)]. \tag{3}$$
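The sampling variance of $U$ is given by

\begin{align}
n^2 \operatorname{var} U
&= \sum_{\alpha,\beta=0}^{n-1} \operatorname{cov}\{[x(T+\alpha w) - x(T+\alpha w+\theta w)],\ [x(T+\beta w) - x(T+\beta w+\theta w)]\}, \tag{4} \\
&= \sum_{\alpha,\beta=0}^{n-1} \{\operatorname{cov}[x(T+\alpha w), x(T+\beta w)] + \operatorname{cov}[x(T+\alpha w+\theta w), x(T+\beta w+\theta w)] \notag \\
&\qquad - \operatorname{cov}[x(T+\alpha w), x(T+\beta w+\theta w)] - \operatorname{cov}[x(T+\alpha w+\theta w), x(T+\beta w)]\}, \tag{5} \\
&= \sum_{\alpha,\beta=0}^{n-1} \bigl( \tfrac{1}{2}\{[\mu(T+\alpha w) - \mu(T+\beta w)]^2 + [\mu(T+\alpha w+\theta w) - \mu(T+\beta w+\theta w)]^2 \notag \\
&\qquad - [\mu(T+\alpha w) - \mu(T+\beta w+\theta w)]^2 - [\mu(T+\alpha w+\theta w) - \mu(T+\beta w)]^2\} \notag \\
&\qquad - \delta(T+\alpha w, T+\beta w) - \delta(T+\alpha w+\theta w, T+\beta w+\theta w) \notag \\
&\qquad + \delta(T+\alpha w, T+\beta w+\theta w) + \delta(T+\alpha w+\theta w, T+\beta w) \bigr), \tag{6} \\
&= \sum_{\alpha,\beta=0}^{n-1} \{-[\mu(T+\alpha w) - \mu(T+\alpha w+\theta w)]\,[\mu(T+\beta w) - \mu(T+\beta w+\theta w)] \notag \\
&\qquad - \delta(T+\alpha w, T+\beta w) - \delta(T+\alpha w+\theta w, T+\beta w+\theta w) \notag \\
&\qquad + \delta(T+\alpha w, T+\beta w+\theta w) + \delta(T+\alpha w+\theta w, T+\beta w)\}. \tag{7}
\end{align}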


If $x(t)$ were stationary, with serial variation function $\delta^*(\tau)$, this would reduce to

$$n^2 \operatorname{var} U = \sum_{\alpha,\beta=0}^{n-1} [\delta^*((\alpha-\beta-\theta)w) - 2\delta^*((\alpha-\beta)w) + \delta^*((\alpha-\beta+\theta)w)]. \tag{8}$$


Each of the terms of the summation in (8) is a second difference and will be small if $\delta^*(\tau)$ is
approximately linear in $\tau$ over the interval $|\tau - (\alpha-\beta)w| < \theta w$, which in practice happens for
large, moderate and sometimes even quite small values of $|\alpha-\beta|$. The number of terms of
the summation which are non-negligible is then of order $n$, not $n^2$, and these effectively
involve values of $\delta^*(\tau)$ only when $\tau$ is small compared with $nw$; evidently, then, the
sampling variance of $U$, a statistic made up of the short-term comparisons in square
brackets in (3), is dependent only on local variational properties of $x(t)$. The questions now
arise of how far this remains true if the assumption of stationarity is discarded, and of
what restrictions of an acceptably realistic kind have to be placed on the probability
parameters of $x(t)$ in order that formulae such as (7) should themselves admit of simplified
expression in terms of such local variational properties.
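
A short numerical sketch of formula (8) may make the order-$n$ behaviour concrete. The two serial variation functions below are assumptions chosen for illustration: one linear in $|\tau|$, for which every off-diagonal second difference vanishes, and one of the form $1 - e^{-|\tau|}$, corresponding to a stationary series with unit variance and serial correlation $e^{-|\tau|}$.

```python
import math
from typing import Callable


def n2_var_u(delta_star: Callable[[float], float], n: int, w: float, theta: float) -> float:
    """Evaluate the double sum in (8), i.e. n^2 var U, for a stationary series."""
    total = 0.0
    for alpha in range(n):
        for beta in range(n):
            k = alpha - beta
            # Second difference of delta_star centred on (alpha - beta) * w.
            total += (delta_star((k - theta) * w)
                      - 2.0 * delta_star(k * w)
                      + delta_star((k + theta) * w))
    return total


def linear_delta(tau: float) -> float:
    """Hypothetical serial variation function, linear in |tau| with slope 2."""
    return 2.0 * abs(tau)


def exponential_delta(tau: float) -> float:
    """Hypothetical serial variation function 1 - exp(-|tau|)."""
    return 1.0 - math.exp(-abs(tau))


def n2_var_u_from_correlation(n: int, w: float, theta: float) -> float:
    """The same quantity computed directly from the covariance exp(-|tau|)."""
    total = 0.0
    for alpha in range(n):
        for beta in range(n):
            k = alpha - beta
            total += (2.0 * math.exp(-abs(k * w))
                      - math.exp(-abs((k - theta) * w))
                      - math.exp(-abs((k + theta) * w)))
    return total


linear_value = n2_var_u(linear_delta, n=5, w=1.0, theta=0.25)
exponential_value = n2_var_u(exponential_delta, n=5, w=1.0, theta=0.25)
```

In the linear case only the $n$ diagonal terms survive, each equal to $2\delta^*(\theta w)$, so the sum is $2n\delta^*(\theta w)$; in the exponential case the result agrees with the value obtained directly from the covariances.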

In answering these questions we are led to the concepts of local homomorphy and smooth 
heteromorphy. Local homomorphy implies that if we examine the local behaviour of a series, 
i.e. the probability structure of its random variation in short stretches, we shall find that it 
has much the same form from one part of the series to another; smooth heteromorphy 
implies that changes in the form of its structure, even for wider behaviour in longer stretches, 
occur gradually and smoothly as we move along the series. These are concepts which may be 
applied to particular parameters of the probability structure. For example, the function 
$\delta(t_1, t_2)$ may be represented geometrically by a serial variation surface. Since the function is
unchanged by the inversion of its arguments, and is zero when they are equal, it will take the
form of two sheets which rise symmetrically from a cuspidal edge (or less commonly a line of
ordinary minima) along the line $t_1 = t_2$ in the $(t_1, t_2)$-plane. For series heteromorphic in
$\delta(t_1, t_2)$ the cross-section of the surface in any specified direction perpendicular to the $(t_1, t_2)$-plane
changes as its point of intersection moves along the cuspidal edge; smooth heteromorphy
implies that the change is smooth in some sense, local homomorphy that it is small
in the neighbourhood of the cusp. For strictly homomorphic series the cross-section does not
change at all, and this carries with it an implication of symmetry in the cross-section which
does not hold in general for heteromorphic series. An illustration of some evenly spaced
cross-sections of a locally homomorphic, smoothly heteromorphic serial variation surface,
plotted against $t_1 - t_2$ as abscissa, is given in Fig. 1.
The differencing operation on the serial variation function in each term of the summation
in (7) is such that if the function is planar the difference is zero. If the square of which the
four vertices are the points
$(T+\alpha w,\ T+\beta w)$, $(T+\alpha w+\theta w,\ T+\beta w+\theta w)$, $(T+\alpha w,\ T+\beta w+\theta w)$, $(T+\alpha w+\theta w,\ T+\beta w)$
does not include any part of the cuspidal edge or its neighbourhood, this difference will 
therefore tend to be small, and will decrease as its distance from the cuspidal edge increases, 
i.e. as the serial variation surface becomes increasingly planar. In the contrary case, when 
the square includes any part of the cuspidal edge or is near to it, the difference in (7), while 
not negligible, will tend to be invariant under translation, if there is local homomorphy. We 
are therefore justified in replacing 6(t,,t,) in (7) by a function of separation d* (t, —t,) which 
has a cusp at zero and a geometrical representation resembling a section of the serial 
variation surface such as those in Fig. 1; it could even be a stationary serial variation curve, 
having the same shape at and near the cusp and similar characteristics of smoothness away 
from the cusp. The differencing operations in (7) either yield an approximately correct value 
(near the cusp) or replace one negligible quantity by another. Such a function as $\delta^*(\tau)$ seems
appropriately described as an acting serial variation function. Extensive local homomorphy
in any finite neighbourhood of $\tau = 0$ is clearly not essential for this representation of (7) in
terms of an acting serial variation function; as long as the true function differs from the acting
function only by a function which is smooth and changes smoothly under translation the
discrepancies will be differenced out, and for this limiting local homomorphy is sufficient.

Fig. 1. Sections of a locally homomorphic serial variation surface. (Abscissa: $t_1 - t_2$.)

Smooth heteromorphy of $x(t)$ in the mean $\mu(t)$ implies that $\mu(t)$ changes smoothly with $t$ so
that differences such as $[\mu(T+\alpha w) - \mu(T+\alpha w+\theta w)]$ will tend to be small; it would be at
once realistic and convenient to assume that the products of these differences which occur in
(7) are of the same order of magnitude as the effects of smooth heteromorphy on the differences
of serial variation parameters. Provided that the smoothness is sufficient, we may
then use the formula (8), which is exact for stationary series, as an approximation to (7),
interpreting $\delta^*(\tau)$ as an acting serial variation function. The resulting error in var $U$ is then
merely the average of the errors in the separate terms of (8). 

An important practical consequence of this and other similar results is that the assump- 
tion of stationarity involved in the specification of structure of observed data may be 
regarded as ‘robust’, applying Box’s concept of robustness (Box (1953)) to this different 
type of situation. Departures from stationarity will often be in the direction of smooth 
heteromorphy, and physical conditions often justify this assumption. The serial variation 


statistics
$$d(\tau) = \operatorname{Av}\,\tfrac{1}{2}[x(t) - x(t+\tau)]^2, \tag{9}$$










obtained by averaging semi-squared differences, can reasonably be used as a basis for estimating
$\delta^*(\tau)$; this function may be fitted to them as for stationary series, since they too have
sampling properties which may be shown to be robust under smoothly heteromorphic
departures from stationarity; furthermore, if $x(t)$ is locally homomorphic, they will have
real descriptive meaning for the smaller values of $\tau$. Tests and measures of heteromorphy
would not be difficult to devise, but for the practising statistician they would fill only 
a rather subsidiary need, like tests of normality in classical statistics; he would usually be 
content to assume smooth heteromorphy in the absence of conspicuous evidence against it. 
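
As a minimal sketch of the serial variation statistics in (9), the following assumes an evenly spaced series held in a list and an integer lag; the function name and the toy series are illustrative only and do not come from the paper.

```python
from typing import Sequence


def serial_variation_statistic(x: Sequence[float], lag: int) -> float:
    """Average of semi-squared differences: d(lag) = Av 1/2 [x(t) - x(t+lag)]^2."""
    if lag < 1 or lag >= len(x):
        raise ValueError("lag must satisfy 1 <= lag < len(x)")
    # Pair each value with the value `lag` steps later.
    semi_squares = [0.5 * (u - v) ** 2 for u, v in zip(x[:-lag], x[lag:])]
    return sum(semi_squares) / len(semi_squares)


# Toy series (an assumption for illustration): a straight line.
series = [0.0, 1.0, 2.0, 3.0]
d1 = serial_variation_statistic(series, 1)
d2 = serial_variation_statistic(series, 2)
```

Fitting an acting serial variation function to such values is a separate step that this sketch does not attempt.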

Again, the concept of smooth heteromorphy reduces the need to incorporate the somewhat 
indeterminate notion of trend into specifications for data, particularly when such trend 
would not be evolutionary in character, but merely a smooth slow movement with unspeci- 
fied variational properties. It is not usually possible to distinguish such a trend from the 
longer term aspects of serial correlation in observed data; fitting it is apt to be an arbitrary 
procedure, and in analysis parameters have to be introduced to specify it. Accordingly, it is 
useful to be able to dispense with it, in this local type of analysis at least; in the specifications 
required it is usually possible to replace trend plus stationary component by a single 
smoothly heteromorphic component. 

The concepts of smooth heteromorphy and local homomorphy are readily generalized to 
concomitant variables, spatial series, and parameters of higher order (leading to the concept 
of local normality); such a general treatment is given in the next section. 


3. GENERAL THEORY 


For simplicity of notation and exposition, the general theory will be developed only for the 
case of a univariate stochastic series $x(t)$. The ideas and proofs, however, extend readily to
the case of a vector random function $x(t)$ with components $x_1(t), x_2(t), \ldots$ defined at points $t$
of a multidimensional space $S$ (cf. Jowett (1955a)), no concept which is not
obvious arising in passing to a vector function and to a higher dimension.

The means $\mu(t)$, standard deviations $\sigma(t)$, covariances $\operatorname{cov}(x(t_1), x(t_2))$ and serial variation
functions $\delta(t_1, t_2)$ are all instances of the general concept of probability parameter function
$\lambda(t_1, t_2, \ldots)$ defining any specified parameter of the multivariate probability distribution of
$x(t_1), x(t_2), \ldots$ as a function of $t_1, t_2, \ldots$; this concept also covers higher order cumulants.

Definition of smooth heteromorphy: The stochastic series $x(t)$ is smoothly heteromorphic in the
probability parameter function $\lambda(t_1, t_2, \ldots, t_m)$ of $m\ (\geq 2)$ phase values $t_1, t_2, \ldots, t_m$ in the interval $R$
if, for all $t_1, t_2, \ldots, t_m$ in $R$, there exists a function $\lambda^*(t_1 - t_2, t_2 - t_3, \ldots)$ of separation only such that
$\lambda - \lambda^*$ and its first and second derivatives with respect to $t_1, t_2, \ldots$ are absolutely bounded in $R$ by
constants $M_{\lambda 0}$, $M_{\lambda 1}$, $M_{\lambda 2}$, respectively. The smoothness of the heteromorphy is characterized by the
smallness of $M_{\lambda 1}$, $M_{\lambda 2}$, which can be achieved by suitable choice of $\lambda^*$.

For functions of a single phase value, $\lambda^*$ will be taken as a constant.

Writing
$$\phi(t_1, t_2, \ldots) = \lambda(t_1, t_2, \ldots) - \lambda^*(t_1 - t_2, t_2 - t_3, \ldots), \tag{10}$$
this definition implies that if $a$ is a constant having the dimensions of $t$ and such that
$|\tau_1|, |\tau_2|, \ldots < a$, and if the interval $[(t_1, t_2, \ldots), (t_1+\tau_1, t_2+\tau_2, \ldots)]$ belongs to $R$, $\lambda$ admits of
a representation in the form
$$\lambda(t_1+\tau_1, t_2+\tau_2, \ldots) = \lambda^*(t_1+\tau_1-t_2-\tau_2, \ldots) + \left(1 + \tau_1\frac{\partial}{\partial t_1} + \tau_2\frac{\partial}{\partial t_2} + \ldots\right)\phi(t_1, t_2, \ldots) + \tfrac{1}{2}m^2a^2\,\Theta_2(t_1+\tau_1, t_2+\tau_2, \ldots)\,M_{\lambda 2}, \tag{11}$$
where $|\Theta_2(t_1+\tau_1, t_2+\tau_2, \ldots)| < 1$, $\phi < M_{\lambda 0}$, $\partial\phi/\partial t_1, \ldots < M_{\lambda 1}$. The function $\lambda^*$ will be called an
acting parameter function; it need not itself possess bounded second derivatives everywhere
in $R$.

If $x(t)$ is smoothly heteromorphic in all cumulants of the $r$th and lower orders, it will be
described as smoothly heteromorphic to the $r$th order. In many applications, smooth heteromorphy
to the second order is all that is necessary, and this can easily be shown to imply
smooth heteromorphy in the serial variation function $\delta(t_1, t_2)$, leading to the concept of an
acting serial variation function $\delta^*(\tau)$, which in practice will usually be found to have a cusp
at $\tau = 0$.

We shall be concerned with intervals of $t$ of width $2a$ (later to be equated to twice the
constant $a$ mentioned above), and with seminvariant linear local functions (s.l.l.f.’s) $L_\alpha(t)$ of
diameter $2a$ defined by Stieltjes integrals as follows:
$$L_\alpha(t) = \int_{-a}^{+a} x(t+\tau)\,dl_\alpha(\tau), \tag{12}$$
where
$$\int_{-A}^{+A} dl_\alpha(\tau) = 0 \quad (\text{all } A \geq a), \tag{13}$$
justifying the use of the term seminvariant. With any s.l.l.f. is associated its absolute
coefficient sum
$$H_\alpha = \int_{-a}^{+a} |dl_\alpha(\tau)| \tag{14}$$
and first absolute coefficient moment
$$K_\alpha = \frac{1}{2a}\int_{-a}^{+a} |\tau|\,|dl_\alpha(\tau)|. \tag{15}$$
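
A discrete illustration of (12) to (15) may help. The sketch below assumes that the Stieltjes measure $dl_\alpha$ is concentrated on a few integer offsets, so that the integrals become finite sums; the second-difference coefficients, the function names and the sample series are hypothetical choices made for the example.

```python
from typing import Sequence


def coefficient_total(coeffs: Sequence[float]) -> float:
    """Discrete analogue of (13): zero for a seminvariant function."""
    return sum(coeffs)


def absolute_coefficient_sum(coeffs: Sequence[float]) -> float:
    """Discrete analogue of (14): H = sum of |c_j|."""
    return sum(abs(c) for c in coeffs)


def first_absolute_moment(offsets: Sequence[int], coeffs: Sequence[float], a: float) -> float:
    """Discrete analogue of (15): K = (1 / 2a) * sum of |tau_j| * |c_j|."""
    return sum(abs(tau) * abs(c) for tau, c in zip(offsets, coeffs)) / (2.0 * a)


def apply_sllf(x: Sequence[float], t: int, offsets: Sequence[int], coeffs: Sequence[float]) -> float:
    """Discrete analogue of (12): L(t) = sum of c_j * x[t + tau_j]."""
    return sum(c * x[t + tau] for tau, c in zip(offsets, coeffs))


# Hypothetical example: the second difference x(t-1) - 2x(t) + x(t+1), semi-diameter a = 1.
OFFSETS = (-1, 0, 1)
COEFFS = (1.0, -2.0, 1.0)
A = 1.0

series = [2.0, 3.0, 7.0, 4.0, 6.0]
shifted = [v + 10.0 for v in series]  # same series with a constant added

H = absolute_coefficient_sum(COEFFS)
K = first_absolute_moment(OFFSETS, COEFFS, A)
value = apply_sllf(series, 2, OFFSETS, COEFFS)
value_shifted = apply_sllf(shifted, 2, OFFSETS, COEFFS)
```

Because the coefficients sum to zero, adding a constant to the series leaves the value of the function unchanged, which is the seminvariant property used repeatedly in the proofs.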


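THEOREM 1. If, for $[(t_1 - a, t_2 - a, \ldots), (t_1 + a, t_2 + a, \ldots)] \subset R$, the heteromorphy of the series $x(t)$
in the probability parameter function $\lambda(t_1, t_2, \ldots, t_m)$ is sufficiently smooth and if $L_\alpha(t_1), L_\beta(t_2), \ldots$
are $m$ localized seminvariant linear local functions of sufficiently small maximum diameter $2a$
having finite absolute coefficient sums $H_\alpha, H_\beta, \ldots$, then
$$\left|\int_{-a}^{+a}\!\!\cdots\!\int_{-a}^{+a} \lambda(t_1+\tau_1, t_2+\tau_2, \ldots)\,dl_\alpha(\tau_1)\,dl_\beta(\tau_2)\cdots - \int_{-a}^{+a}\!\!\cdots\!\int_{-a}^{+a} \lambda^*(t_1+\tau_1-t_2-\tau_2, \ldots)\,dl_\alpha(\tau_1)\,dl_\beta(\tau_2)\cdots\right| < \epsilon, \tag{16}$$
where $\epsilon$ is a preassigned small positive quantity.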

Proof. When the right-hand side of (11) is substituted in the left-hand integral in (16), the
seminvariant property of the $L$’s ensures that all terms not involving all of $\tau_1, \tau_2, \ldots$ yield
integrals which are identically zero.
$$\left|\int_{-a}^{+a}\!\!\cdots\!\int_{-a}^{+a} \tfrac{1}{2}m^2a^2\,\Theta_2(t_1+\tau_1, t_2+\tau_2, \ldots)\,M_{\lambda 2}\,dl_\alpha(\tau_1)\,dl_\beta(\tau_2)\cdots\right| \leq \tfrac{1}{2}m^2a^2M_{\lambda 2}H_\alpha H_\beta\cdots \tag{17}$$
and the quantity on the right-hand side of (17) will be less than $\epsilon$ if $M_{\lambda 2}$ and $a$ are small
enough. The theorem follows.

In Theorem 1 and subsequent theorems we are not in fact thinking of $\epsilon$ in the customary
way as a quantity ultimately to be made vanishingly small, but as a quantity which must be
acceptably small for the theorems to be applied, its size being governed jointly by the
smoothness of the heteromorphy of a given series and the smallness of diameter of the s.l.l.f.’s
involved in the statistical analysis which it is proposed to apply to it. Sometimes, as for
example in Jump analysis (Jowett (1955d) and Jowett & Wright (in preparation)), it is the
diameter of the s.l.l.f.’s which has to be made sufficiently small to make the theorems apply
in the case of a given series, and thus make statistical analysis relevant to some issue possible
at all.


THEOREM 2. If, for $[(t_1 - a, t_2 - a, \ldots), (t_1 + a, t_2 + a, \ldots)] \subset R$, the heteromorphy of the series
$x(t)$ is sufficiently smooth to the second order, the sampling covariance of any pair of s.l.l.f.’s $L_\alpha(t_1)$,
$L_\beta(t_2)$ localized at points $t_1$, $t_2$ of $R$ and having finite absolute coefficient sums $H_\alpha$, $H_\beta$ and first
moments $K_\alpha$, $K_\beta$ may be made to satisfy an inequality of the form
$$\left|\operatorname{cov}[L_\alpha(t_1), L_\beta(t_2)] - \int_{-a}^{+a}\!\!\int_{-a}^{+a} -\delta^*(t_1+\tau_1-t_2-\tau_2)\,dl_\alpha(\tau_1)\,dl_\beta(\tau_2)\right| < \epsilon, \tag{18}$$
where $\delta^*(\tau)$ is an acting serial variation function of separation only, $a$ is the greater of the semi-diameters
of the two s.l.l.f.’s, and $\epsilon$ is a preassigned small positive quantity.
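Proof.
$$\operatorname{cov}[L_\alpha(t_1), L_\beta(t_2)] = \int_{-a}^{+a}\!\!\int_{-a}^{+a} \operatorname{cov}[x(t_1+\tau_1), x(t_2+\tau_2)]\,dl_\alpha(\tau_1)\,dl_\beta(\tau_2) \tag{19}$$
$$= \int_{-a}^{+a}\!\!\int_{-a}^{+a} \{\tfrac{1}{2}\sigma^2(t_1+\tau_1) + \tfrac{1}{2}\sigma^2(t_2+\tau_2) - \delta(t_1+\tau_1, t_2+\tau_2) + \tfrac{1}{2}[\mu(t_1+\tau_1) - \mu(t_2+\tau_2)]^2\}\,dl_\alpha(\tau_1)\,dl_\beta(\tau_2), \tag{20}$$
$$= \int_{-a}^{+a}\!\!\int_{-a}^{+a} \{-\delta(t_1+\tau_1, t_2+\tau_2) + \tfrac{1}{2}[\mu(t_1+\tau_1) - \mu(t_2+\tau_2)]^2\}\,dl_\alpha(\tau_1)\,dl_\beta(\tau_2), \tag{21}$$
the integrals of the terms involving $\tau_1$ or $\tau_2$ only vanishing because of the seminvariant
property of the s.l.l.f.’s.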
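In (21) we may expand the square. Because of the seminvariant property, the integrals
of the terms in $\mu^2(t_1+\tau_1)$ and $\mu^2(t_2+\tau_2)$ vanish. Hence, using (11) and the seminvariant
property,
$$\operatorname{cov}[L_\alpha(t_1), L_\beta(t_2)] = \int_{-a}^{+a}\!\!\int_{-a}^{+a} \{-\delta^*(t_1+\tau_1-t_2-\tau_2) - 2a^2M_{\delta 2}\,\Theta_2(t_1+\tau_1, t_2+\tau_2)\}\,dl_\alpha(\tau_1)\,dl_\beta(\tau_2)$$
$$\qquad - \int_{-a}^{+a} \left\{\tau_1\frac{d\mu(t_1)}{dt_1} + \tfrac{1}{2}a^2M_{\mu 2}\,\Theta_1(t_1+\tau_1)\right\} dl_\alpha(\tau_1) \times \int_{-a}^{+a} \left\{\tau_2\frac{d\mu(t_2)}{dt_2} + \tfrac{1}{2}a^2M_{\mu 2}\,\Theta_1(t_2+\tau_2)\right\} dl_\beta(\tau_2). \tag{22}$$
It follows that
$$\left|\operatorname{cov}[L_\alpha(t_1), L_\beta(t_2)] - \int_{-a}^{+a}\!\!\int_{-a}^{+a} -\delta^*(t_1+\tau_1-t_2-\tau_2)\,dl_\alpha(\tau_1)\,dl_\beta(\tau_2)\right| < 2a^2M_{\delta 2}H_\alpha H_\beta + (2aK_\alpha + \tfrac{1}{2}a^2M_{\mu 2}H_\alpha)(2aK_\beta + \tfrac{1}{2}a^2M_{\mu 2}H_\beta). \tag{23}$$
If $M_{\delta 2}$, $M_{\mu 2}$ and $a$ are sufficiently small, the right-hand side of (23) will be less than $\epsilon$. This
proves the theorem.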

Definition of acting normality. The stochastic series $x(t)$ possesses the property of acting normality
in the interval $R$ if it is smoothly heteromorphic in $R$ with acting cumulants of order
higher than the second which may be taken as zero.

THEOREM 3. If in an interval $R$ the heteromorphy of $x(t)$ is sufficiently smooth to order $s$, and
if $x(t)$ has the property of acting normality, then for any set of $s$ s.l.l.f.’s with bounded absolute
coefficient sums and first moments, with diameters not greater than $2a$, and involving only phase
values lying in $R$,
$$E[L_{\alpha_1}(t_1)\,L_{\alpha_2}(t_2)\cdots L_{\alpha_s}(t_s)] = \Sigma\,C_{uv}C_{wx}\cdots + O(\epsilon), \tag{24}$$
where
$$C_{uv} = \int_{-a}^{+a}\!\!\int_{-a}^{+a} -\delta^*(t_u+\tau_u-t_v-\tau_v)\,dl_{\alpha_u}(\tau_u)\,dl_{\alpha_v}(\tau_v), \tag{25}$$
and the summation is taken over all partitions of the suffixes $1, 2, \ldots, s$ into different pairs.
Proof. Let $\kappa(t_u, t_v, \ldots, t_w)$ be the cumulant of highest order associated with the product
$x(t_u)\,x(t_v)\cdots x(t_w)$, repetition of arguments being permissible. Thus, for example
$$\kappa(t, t+\tau) = \operatorname{cov}(x(t), x(t+\tau)), \qquad \kappa(t, t) = \sigma^2(t).$$
From the relationship between the moments and cumulants of a multivariate distribution,
it may easily be shown that
$$E(x(t_u)\,x(t_v)\cdots x(t_w)) = \Sigma\,\kappa(t_u, \ldots)\,\kappa(\ldots)\cdots\kappa(\ldots), \tag{26}$$
where the summation is of all products of cumulants obtainable as follows: the symbols are
partitioned into one or more subsets; the symbols of each subset are taken as the arguments
of a cumulant; and all the cumulants resulting from the particular partitioning are multiplied
together. Thus, for example,
$$E(x(t_u)\,x(t_v)\,x(t_w)) = \kappa(t_u, t_v, t_w) + \kappa(t_u)\,\kappa(t_v, t_w) + \kappa(t_u, t_v)\,\kappa(t_w) + \kappa(t_u, t_w)\,\kappa(t_v) + \kappa(t_u)\,\kappa(t_v)\,\kappa(t_w). \tag{27}$$
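
The partitioning rule behind (26) can be sketched as a short enumeration. The code below is only an illustration: it lists set partitions of a few symbols, prints each as a product of cumulants in an ad hoc string notation, and separately picks out the partitions into pairs, which are the only ones that survive under acting normality in Theorem 3. The function names and the string format are assumptions of the example.

```python
from typing import Iterator, List, Sequence


def set_partitions(items: Sequence[str]) -> Iterator[List[List[str]]]:
    """Yield every partition of `items` into one or more non-empty subsets."""
    if not items:
        yield []
        return
    first, rest = items[0], list(items[1:])
    for partition in set_partitions(rest):
        # Put `first` into each existing subset in turn ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or give it a subset of its own.
        yield [[first]] + partition


def cumulant_terms(symbols: Sequence[str]) -> List[str]:
    """One product of cumulants per partition, as in the summation of (26)."""
    return [
        " ".join("k(" + ",".join(block) + ")" for block in partition)
        for partition in set_partitions(symbols)
    ]


def pair_partitions(symbols: Sequence[str]) -> List[List[List[str]]]:
    """Partitions in which every subset has exactly two members."""
    return [p for p in set_partitions(symbols) if all(len(b) == 2 for b in p)]


terms_for_three = cumulant_terms(["u", "v", "w"])   # the five terms of (27)
pairs_for_four = pair_partitions(["1", "2", "3", "4"])
```

For three symbols the enumeration reproduces the five products in (27); for four suffixes only three pairings remain.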


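The constitution of the summation is unaffected by coincidences in the actual values
$t_u, t_v, \ldots, t_w$. Now
$$E[L_{\alpha_1}(t_1)\,L_{\alpha_2}(t_2)\cdots L_{\alpha_s}(t_s)]$$
$$= \int_{-a}^{+a}\!\!\cdots\!\int_{-a}^{+a} E[x(t_1+\tau_1)\,x(t_2+\tau_2)\cdots x(t_s+\tau_s)]\,dl_{\alpha_1}(\tau_1)\cdots dl_{\alpha_s}(\tau_s), \tag{28}$$
$$= \int_{-a}^{+a}\!\!\cdots\!\int_{-a}^{+a} [\Sigma\,\kappa(t_u+\tau_u, \ldots)\,\kappa(t_v+\tau_v, \ldots)\cdots]\,dl_{\alpha_1}(\tau_1)\cdots dl_{\alpha_s}(\tau_s), \tag{29}$$
$$= \Sigma\left[\int\!\!\cdots\!\int \kappa(t_u+\tau_u, \ldots)\,dl_{\alpha_u}(\tau_u)\cdots\right]\cdots\left[\int\!\!\cdots\!\int \kappa(t_v+\tau_v, \ldots)\,dl_{\alpha_v}(\tau_v)\cdots\right], \tag{30}$$
the summations in (28), (29), (30) being defined as in (26). If the heteromorphy of $x(t)$ is
sufficiently smooth, and $a$ sufficiently small, every integral in (30) may be replaced, to the
order of a given $\epsilon$, by that of an acting cumulant which is a function of separation only.
Furthermore, if $x(t)$ has acting normality, only the acting cumulants of the second order
survive. Hence, if we neglect terms of order $\epsilon$, the only terms of (30) which make a contribution
are those consisting entirely of products of integrals of second order cumulants. Now
typically
$$\int_{-a}^{+a}\!\!\int_{-a}^{+a} \kappa(t_u+\tau_u, t_v+\tau_v)\,dl_{\alpha_u}(\tau_u)\,dl_{\alpha_v}(\tau_v) = \operatorname{cov}[L_{\alpha_u}(t_u), L_{\alpha_v}(t_v)] \tag{31}$$
$$= \int_{-a}^{+a}\!\!\int_{-a}^{+a} -\delta^*(t_u+\tau_u-t_v-\tau_v)\,dl_{\alpha_u}(\tau_u)\,dl_{\alpha_v}(\tau_v) + O(\epsilon) \tag{32}$$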










by Theorem 2. The proof of the theorem then follows at once. Formula (32) differs from the
corresponding formula for stationary series, established in (1955a), only in the presence of
the term $O(\epsilon)$ and the use of an acting, instead of actual, serial variation function of separation
only. Accordingly, if $\delta^*(\tau)$ has sufficient tendency towards linearity with increasing $|\tau|$,
the methods of that paper may be applied and lead to the following theorem, which is
a suitably modified form of the theorem established in § 2 of that paper, and which is proved
in the same way:


THEOREM 4. The expectation of any product of powers of s.l.l.f.’s of maximum diameter $2a$
of values in an interval $R$ of a stochastic series $x(t)$ having, in $R$,

(a) heteromorphy which is sufficiently smooth,

(b) acting normality,

(c) an acting serial variation function which has a sufficiently rapid tendency to linearity as
the separation $|\tau|$ increases beyond $a$,
may be expressed as a sum of terms which are either of magnitude at most of specified order $\epsilon$, or
involve only values of the acting serial variation function in a neighbourhood of $|\tau| = 0$ having a
magnitude of order $a$, i.e. depending on local variational properties of $x(t)$.

Many statistics which are of practical interest are means, or functions of means of products
of powers of $n$ s.l.l.f.’s which are evenly spread over a region of $S$ and their asymptotic
sampling moments depend ultimately on the $n^2$ covariances of the s.l.l.f.’s involved in them.
Because of the even spread, the number of these which are not far apart (a term defined
precisely in (1955a)) is of magnitude $O(n)$. If $x(t)$ has the properties specified in Theorem 4 it
is only the covariances of these which are not negligible, and these may be expressed in
terms of what are essentially local variational properties, as indicated in Theorem 4.
Following the arguments of (1955a) we may assert the principle that for smoothly 
heteromorphic series the sampling properties of local statistics depend essentially on local 
properties; in general the same sampling formulae may be used as in stationary series, with the 
acting serial variation function playing the role of the serial variation function in stationary 
series. 

It will be observed that none of the theorems explicitly involve the concept of local
homomorphy, which was mentioned in § 2; it is not necessary for their application that $\lambda^*$
should approximate to $\lambda$ in any sense. Nevertheless, local homomorphy is a useful property
for a series to have; it is consistent with smooth heteromorphy, and implies the possibility of
such an approximation, namely that $\lambda^*$ may be chosen to approximate to $\lambda$ when the separations
of the arguments $(t_1, t_2, \ldots)$ involved are of order not greater than $a$. Accordingly it is
worthy of formal definition, which is conveniently given in terms of the corresponding 
limiting property. 

Definition of limiting local homomorphy. The stochastic series $x(t)$ is locally homomorphic in
the limit in the interval $R$ if for any $\epsilon > 0$ there exists a magnitude $a$, depending on $\epsilon$, such that
whenever $[t_1, t_2, \ldots, t_m]$ lies in $R$ and can be included in some interval of width $2a$, an acting
probability parameter function $\lambda^*(t_1 - t_2, \ldots)$ can be found which is such that
$$|\lambda(t_1, t_2, \ldots) - \lambda^*(t_1 - t_2, \ldots)| < \epsilon. \tag{33}$$


Limiting local homomorphy will be taken to imply some degree of local homomorphy, and
carries with it the possibility of actually approximating to $\lambda$ by means of $\lambda^*$; the extent of the
local homomorphy is characterized by the nature of the approach to the limit, a large value
of $a$ for a given $\epsilon$ implying greater local homomorphy than a small one.

For some probability parameter functions, smooth heteromorphy is sufficient for limiting
local homomorphy. An important instance is the serial variation function $\delta(t_1, t_2)$. The
definition of smooth heteromorphy for $[t_1, t_2] \in R$ permits us to choose $\delta^*$ so that $\delta^*(0) = 0$.
Writing
$$\phi(t_1, t_2) = \delta(t_1, t_2) - \delta^*(t_1 - t_2), \tag{34}$$
from the boundedness of the second derivatives of $\phi$ it follows that for some $t_1'$, $t_2'$ lying in the
interval $t_1$, $t_2$
$$\phi(t_1, t_2) = (t_1 - \bar{t})\,\frac{\partial\phi}{\partial t_1}(t_1', t_2) + (t_2 - \bar{t})\,\frac{\partial\phi}{\partial t_2}(\bar{t}, t_2'), \tag{35}$$
where $\bar{t} = \tfrac{1}{2}(t_1 + t_2)$. If $|t_1 - t_2| < 2a$, where $a = \epsilon/2M_{\delta 1}$,
$$|\phi(t_1, t_2)| < 2aM_{\delta 1} = \epsilon, \tag{36}$$
thus establishing limiting local homomorphy in $\delta$. On the other hand, consideration of the
case
$$\lambda(t_1, t_2, t_3) = \delta(t_1, t_2)\,\mu(t_3) \tag{37}$$


shows that smooth heteromorphy is not always sufficient to secure limiting local
homomorphy.


DEPENDENCE OF THE FIDUCIAL ARGUMENT
ON THE SAMPLING RULE†


By F. J. ANSCOMBE 


Princeton University 


Suppose that we have a series of observations, given to have been drawn independently
from a common chance distribution of specified form depending on a single unknown parameter
$\theta$. Suppose the observations have been taken according to some sampling rule such
that the eventual number of observations does not depend on $\theta$ except possibly through
the observations themselves. All recognized kinds of sequential rule, including fixed-sample-size
rules, satisfy this condition. (A type of sampling rule to be excluded would be
one where the number of observations depended on other observations subsequently
suppressed.)

It is well known that the likelihood function of $\theta$, given the observations, is independent
of the sampling rule. For in any expression for the chance of observing what has actually
been observed, the sampling rule only enters as a factor independent of $\theta$, which is therefore
irrelevant when the chances corresponding to different values of $\theta$ are compared. It follows
that the posterior probability distribution for $\theta$ derived by Bayes’s theorem from some
given prior distribution is also independent of the sampling rule. (I shall refer to this use
of Bayes’s theorem as ‘the Bayesian argument’, for short.)
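
A small numerical illustration of this independence, using binomial trials rather than the continuous-time processes treated later in the note: the sketch compares the chance of the same record under a fixed-sample-size rule and under an inverse rule that stops at a prescribed number of successes. The particular numbers and function names are assumptions chosen for the example.

```python
from math import comb


def fixed_size_chance(r: int, n: int, theta: float) -> float:
    """Chance of r successes when the number of trials n is fixed in advance."""
    return comb(n, r) * theta**r * (1 - theta) ** (n - r)


def inverse_rule_chance(r: int, n: int, theta: float) -> float:
    """Chance that the r-th success arrives exactly on trial n, r being fixed in advance."""
    return comb(n - 1, r - 1) * theta**r * (1 - theta) ** (n - r)


def chance_ratio(r: int, n: int, theta: float) -> float:
    """Ratio of the two chances; the sampling rule enters only through this factor."""
    return fixed_size_chance(r, n, theta) / inverse_rule_chance(r, n, theta)


# Hypothetical record: 3 successes in 12 trials.
ratios = [chance_ratio(3, 12, theta) for theta in (0.1, 0.25, 0.5, 0.9)]
```

The ratio is the same for every value of the parameter, so the two rules give proportional likelihood functions and hence the same posterior distribution from any given prior.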

On the other hand, in order to make a significance test (for instance, a test of whether 
the specified form of parent distribution is reasonably compatible with the observations), 
we need to take the sampling rule into account, in general. The same is true of other common 
statistical procedures; for examples and references, see Anscombe (1954). 

The purpose of this note is to show that in order to apply Fisher’s fiducial argument (as 
recently re-expounded by him, 1956) we need to consider the sampling rule, just as for 
significance tests. This is interesting for two reasons. 

First, Fisher has stressed that a fiducial distribution is to be interpreted in exactly the 
same way as if it had been derived by a Bayesian argument, and indeed a fiducial distribu- 
tion may be taken as prior distribution for incorporation in a subsequent Bayesian argu- 
ment. Moreover, when the fiducial argument is for mathematical reasons not available, 
Fisher suggests that the likelihood function may be considered directly instead. Thus, what 
the fiducial argument is likened to and intimately associated with, and what is suggested 
as a substitute for it when it is not available, both differ from it in regard to dependence on 
the sampling rule. 

Secondly, Lindley (1957) has shown that, if the common chance distribution of the
observations referred to in the opening sentence above is a member of the exponential
family, then the fiducial argument does not have a property of consistency evidently desired
for it by Fisher unless the resulting distribution for $\theta$ is actually a posterior Bayes distribution.
The examples below show that, in so far as fiducial distributions can be identified
with posterior Bayes distributions, the corresponding prior distributions depend on the
sampling rule. It is also pointed out that the same applies (in a more striking degree) to
decision procedures chosen by the minimax rule, when these are expressed in Bayesian
terms.

† Prepared in connexion with research sponsored by the United States Office of Naval Research.

Thus, Fisher’s hierarchy of inferential methods, namely (in ascending order of informa- 
tiveness), (i) significance tests, (ii) contemplation of the likelihood function, (iii) the fiducial 
argument, shows a curious oscillation between heeding and ignoring the sampling rule. 
Items (i) and (iii) require for their definition that the whole sample space be defined; and 
this property is shared with Neyman’s confidence intervals, the sampling distribution of a 
statistic, and Wald’s minimax decision procedures. But for item (ii), as for the classic use 
of Bayes’s theorem, we require to know no more of sample space than the single observed 
point. It seems to me that, before Fisher’s account of statistical reasoning can be accepted 
as correct, this oscillation must be explained and made acceptable. More generally, anyone 
who claims correct understanding of statistical inference needs to explain when, why, and 
how, knowledge of the sampling rule is relevant to the interpretation of given observations. 


Examples showing dependence of the fiducial argument on the sampling rule. In constructing 
such examples, the difficulty is encountered that when the sample size is a random variable 
its distribution, being discrete, cannot be made the basis for an exact fiducial argument, 
though for large sample sizes an approximate fiducial argument may be possible. In order 
to avoid this kind of imprecision, the examples below refer, not to ordinary sampling with 
a finite number of observations, but to the sampling of a stochastic process with continuous 
time parameter. Large samples from (respectively) binomial and normal populations will 
exhibit approximately the same phenomena. 


Example 1. Let $r_\theta(t)$ denote a Poisson process with continuous time parameter $t$ and
independent increments, such that
$$r_\theta(t+\tau) - r_\theta(t)$$
has a Poisson distribution with mean $\theta\tau$, for any positive $\tau$ and any $t$. Thus the jumps in
$r_\theta(t)$ occur at mean rate $\theta$ per unit time. We suppose $\theta$ to be positive but otherwise unknown.
Starting with $r_\theta(0) = 0$, let a realization of the process be observed continuously for a period,
at the end of which $t$ has the value $T$ and $r_\theta(t)$ has the value $R$, say. The likelihood function
of $\theta$, given the observations, depends only on the end-point $(T, R)$, being†
$$\theta^R e^{-\theta T}. \tag{1}$$

Suppose we are told that the duration of observation, $T$, was fixed in advance. Then $R$
is the observed value of a chance variable (for fixed $\theta$) having the Poisson distribution with
mean $\theta T$. No exact fiducial distribution for $\theta$, given $R$, can be found, but if $R$ is large we
can use an approximate fiducial argument (Fisher, 1956, pp. 62-3), asymptotically equivalent
to the use of Bayes’s theorem with likelihood (1) and prior distribution‡
$$\frac{d\theta}{\sqrt{\theta}}. \tag{2}$$
Suppose, alternatively, we are told that sampling was ‘inverse’, $R$ was fixed in advance,
and observation stopped at the first instant when the condition $r_\theta(t) = R$ was satisfied.
Then $T$ is the observed value of a chance variable (for fixed $\theta$) having the distribution
$$\frac{\theta^R\,T^{R-1}\,e^{-\theta T}\,dT}{(R-1)!}. \tag{3}$$
This inverts (see Fisher, 1956, p. 53) to give a fiducial distribution for $\theta$, given $T$, namely
$$\frac{T^R\,\theta^{R-1}\,e^{-\theta T}\,d\theta}{(R-1)!}, \tag{4}$$
which is the same as the posterior Bayes distribution for $\theta$ derived from (1) with the prior
distribution
$$\frac{d\theta}{\theta}. \tag{5}$$

† The likelihood function is a set of odds for $\theta$, and is therefore arbitrary up to a multiplying factor
independent of $\theta$.

‡ Prior distributions need not be normalized, since constant factors cancel out when Bayes’s theorem
is used.

Example 2. This is similar to Example 1 except that now both interpretations of the
observations lead to proper fiducial distributions. Let $x_\theta(t)$ denote a normal (Wiener)
process with continuous time parameter $t$ and independent increments, such that
$$x_\theta(t+\tau) - x_\theta(t)$$
is normally distributed with mean $\theta\tau$ and variance $\tau$, for any positive $\tau$ and any $t$. Thus
$x_\theta(t)$ increases at the rate $\theta$ per unit time in mean and 1 per unit time in variance. We suppose
$\theta$ to be unknown. Starting with $x_\theta(0) = 0$, let a realization of the process be observed continuously
for a period, at the end of which $t$ has the value $T$ and $x_\theta(t)$ has the value $X$, say.
We shall suppose that $X$ is positive. The likelihood function of $\theta$, given the observations,
depends only on the end-point $(T, X)$, being
$$e^{\theta X - \frac{1}{2}\theta^2 T}. \tag{6}$$


Suppose we are told that the duration of observation, $T$, was fixed in advance. Then $X$ is
the observed value of a chance variable (for fixed $\theta$) normally distributed with mean $\theta T$
and variance $T$. This inverts to give a fiducial distribution for $\theta$, given $X$, namely the normal
distribution
$$\sqrt{\frac{T}{2\pi}}\;e^{-\frac{1}{2}T(\theta - X/T)^2}\,d\theta, \tag{7}$$
which is the same as the posterior Bayes distribution for $\theta$ derived from (6) with the uniform
prior distribution
$$d\theta. \tag{8}$$


Suppose, alternatively, we are told that $X$ was fixed in advance, and observation stopped
at the first instant when the condition $x_\theta(t) = X$ was satisfied. Now this specification of the
sampling rule is not adequate as it stands, unless there is prior knowledge that $\theta \geq 0$;
for if $\theta < 0$ the chance that the positive value $X$ will ever be attained by the process is less
than one (in fact $e^{2\theta X}$). In the absence of prior knowledge that $\theta \geq 0$, the sampling rule
may be truncated, thus: observation stops at the first instant when either $x_\theta(t) = c$ or
$t = k$, where $c$ and $k$ are given positive numbers. If $(T, X)$ denotes the end-point of the sample
path, either $X = c$ or $T = k$. Let
$$T^* = T + c - X,$$
so that $T^* = T$ if $X = c$ and $T^* = k + c - X$ if $T = k$. Then clearly $T^*$ is a sufficient statistic
for $\theta$. It is shown below (next paragraph) that the chance distribution for $T^*$, given $\theta$, has
the monotonicity property needed for the derivation of a fiducial distribution for $\theta$, given
$T^*$ (Fisher, 1956, p. 69). Note that for any sample path such that $T < k$, the value of $k$ is
not needed in the fiducial argument, which is based on the cumulative distribution function
of $T^*$, given $\theta$, at the observed value of $T$.





To verify the monotonicity property of the distribution of $T^*$, let $F_\theta(x)$ denote the chance
that $T^* \leq x$. Then $F_\theta(x)$ is the chance-measure of all sample paths in the interval $0 \leq t \leq k$
for which $T^* \leq x$. Let each of these paths be deformed from $x_\theta(t)$ to $x_\theta(t) + \alpha t$, where $\alpha$ is
a positive constant. All the deformed paths satisfy $T^* \leq x$, and if $\theta$ is changed to $\theta + \alpha$ their
measure is still $F_\theta(x)$. In general there are also other paths, besides these deformed paths,
for which $T^* \leq x$. It follows that $F_{\theta+\alpha}(x) \geq F_\theta(x)$, i.e. $F_\theta(x)$ is an increasing function of $\theta$ as
well as of $x$. For given positive $x$, $F_\theta(x)$ runs from 0 to 1 as $\theta$ runs from $-\infty$ to $+\infty$;
for given $\theta$, it runs from 0 to 1 as $x$ runs from 0 to $\infty$.

Suppose, then, we are told that the given sample path was observed under a sampling
rule of the sort just described, and that $x_\theta(t)$ attained its prescribed upper bound (therefore
equal to $X$) before the truncation provision came into effect. The chance distribution for
$T$, given $\theta$ and $X$, is (see Feller, 1950, p. 296)
$$\frac{X}{\sqrt{2\pi}}\;T^{-3/2}\,e^{-\frac{1}{2}T(\theta - X/T)^2}\,dT. \tag{9}$$
(The integral of this from 0 to the truncation time $k$ is the chance that the process attains
the bound $X$ before truncation.) The fiducial distribution for $\theta$, given $T$, derived from (9)
is not a posterior Bayes distribution for any prior distribution, because the condition given
by Lindley is not satisfied (namely, that there exist transformations of $T$ to $U$ and of $\theta$ to $\phi$,
such that $\phi$ is a location parameter for $U$). However, we shall see that if $X$ is large the
fiducial distribution is approximately a posterior Bayes distribution.†

If we integrate (9) with respect to $T$ and then take the differential with respect to $\theta$, we
obtain the fiducial distribution for $\theta$ in the form
$$d\theta\int_0^T \frac{X}{\sqrt{2\pi}}\;t^{-3/2}\,(X - \theta t)\,e^{-\frac{1}{2}t(\theta - X/t)^2}\,dt.$$
After changing the variable of integration from $t$ to $\sqrt{t}\,(\theta + X/t)$, we obtain the explicit
expression
$$2X\,e^{2X\theta}\,\Phi(-V)\,d\theta, \tag{10}$$
where $V = \sqrt{T}\,(\theta + X/T)$ and $\Phi(t) = \int_{-\infty}^{t} e^{-\frac{1}{2}u^2}\,du/\sqrt{2\pi}$.

We now restrict attention to values of $\theta$ near $X/T$, according to the asymptotic specification:
$X \to \infty$, $T \to \infty$, with $X/T$ constant, $\sqrt{T}\,|\theta - X/T|$ bounded. On using the asymptotic
expansion for $\Phi(-V)$ as $V \to \infty$, we see that (10) is
$$\frac{1}{\sqrt{2\pi}}\;e^{-\frac{1}{2}T(\theta - X/T)^2}\;\frac{2X}{V}\left\{1 + O\!\left(\frac{1}{T}\right)\right\}d\theta.$$
It is easy to show that
$$\frac{V^2}{4X} = \theta + O\!\left(\frac{1}{T}\right),$$
and so finally we have the fiducial distribution for $\theta$ in the form
$$\sqrt{\frac{X}{2\pi\theta}}\;e^{-\frac{1}{2}T(\theta - X/T)^2}\left\{1 + O\!\left(\frac{1}{T}\right)\right\}d\theta. \tag{11}$$
This is asymptotically the posterior Bayes distribution for $\theta$ derived from (6) with the prior
distribution
$$\frac{d\theta}{\sqrt{|\theta|}}. \tag{12}$$
(This has been established for values of $\theta$ near $X/T$. The insertion of the modulus sign in
the denominator extends the result to include what we should have obtained if we had
supposed $X$ to be negative.)

† The following proof, shorter than my original one, is due to Dr H. E. Daniels.
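
The inverse-sampling result of Example 2 can be checked numerically under stated assumptions. The sketch below takes the fiducial density to be $2Xe^{2X\theta}\Phi(-V)$ with $V = \sqrt{T}(\theta + X/T)$, compares it with a numerical derivative in $\theta$ of the standard closed form for the first-passage distribution function (a textbook formula that is assumed here, not quoted from the note), and then compares it with the normalized product of likelihood (6) and prior $1/\sqrt{\theta}$, namely $\sqrt{X/(2\pi\theta)}\,e^{-\frac{1}{2}T(\theta - X/T)^2}$. The values $X = T = 20$ are arbitrary.

```python
from math import erfc, exp, pi, sqrt


def Phi(z: float) -> float:
    """Standard normal distribution function."""
    return 0.5 * erfc(-z / sqrt(2.0))


def first_passage_cdf(T: float, theta: float, X: float) -> float:
    """Chance that a Wiener process with drift theta has reached level X > 0 by time T.

    Standard closed form, assumed here rather than taken from the note.
    """
    return (Phi((theta * T - X) / sqrt(T))
            + exp(2.0 * theta * X) * Phi(-(theta * T + X) / sqrt(T)))


def fiducial_density_exact(theta: float, T: float, X: float) -> float:
    """Explicit expression of the form 2X exp(2X theta) Phi(-V)."""
    V = sqrt(T) * (theta + X / T)
    return 2.0 * X * exp(2.0 * X * theta) * Phi(-V)


def fiducial_density_asymptotic(theta: float, T: float, X: float) -> float:
    """Leading term: normalized likelihood (6) times the prior 1/sqrt(theta)."""
    return sqrt(X / (2.0 * pi * theta)) * exp(-0.5 * T * (theta - X / T) ** 2)


X_OBS, T_OBS, THETA = 20.0, 20.0, 1.0   # arbitrary illustrative values
h = 1e-5
numerical = (first_passage_cdf(T_OBS, THETA + h, X_OBS)
             - first_passage_cdf(T_OBS, THETA - h, X_OBS)) / (2.0 * h)
exact = fiducial_density_exact(THETA, T_OBS, X_OBS)
approx = fiducial_density_asymptotic(THETA, T_OBS, X_OBS)
relative_gap = abs(exact / approx - 1.0)
```

At these values the exact density and the asymptotic form differ by roughly one per cent, which is consistent with a relative error of order $1/T$.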


Remarks. (1) In Example 1, if we had not been given the complete observational record,
but only the end-point $(T, R)$, to be told that the sampling rule was the inverse one would
give us an extra piece of information about the sample path, namely, that the ordinate
jumped from $R-1$ to $R$ at the final instant $T$. This extra information is reflected in the
fiducial argument (when we translate it into Bayesian terms) by the replacement of the
prior distribution (2) by (5), which has the effect of attaching more weight to lower values
of $\theta$. Thus, the fiducial argument does not use the observations only in the form of the likelihood
function, the ‘sufficient’ statistics $(T, R)$ are not indeed sufficient for it; further
information about the observations, not contained in the likelihood function, is required.†
This, of course, is in contrast with the classic use of Bayes’s theorem, where the prior dis- 
tribution, however it may be determined, is wholly independent of the observations used 
to calculate the likelihood function. (The same remark, with obvious changes, applies 
equally to Example 2.) 

This point applies to other kinds of statistical argument besides the fiducial; for instance,
to minimax decision procedures. From the mere fact that a procedure can be expressed in
terms of Bayes’s theorem with some specially-chosen prior distribution, we cannot infer
that the procedure ignores the sampling rule, because the ‘prior’ distribution may depend
on the sampling rule (and so not be prior at all). This is illustrated by the minimax
procedure for estimating a binomial chance $\theta$ from a sample of fixed size $n$, when the loss in
quoting the estimate $\hat\theta$ is proportional to $(\hat\theta - \theta)^2$. The minimax $\hat\theta$ is what would be obtained
by the Bayesian argument if the prior distribution for $\theta$ were

$$[\theta(1-\theta)]^{\frac{1}{2}\sqrt{n}-1}\,d\theta,$$

which depends on $n$. (See Hodges & Lehmann, 1950.)
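
The dependence of this prior on $n$ can be checked numerically. The sketch below assumes the standard result that, under squared-error loss, the Bayes estimate is the posterior mean, which for the prior above and $x$ successes in $n$ trials is $(x + \frac{1}{2}\sqrt{n})/(n + \sqrt{n})$. The function names are illustrative only. It evaluates the exact mean squared error at several values of $\theta$ and compares it with the constant $1/\{4(1+\sqrt{n})^2\}$.

```python
from math import comb, sqrt


def minimax_estimate(x: int, n: int) -> float:
    """Posterior mean of theta under the Beta(sqrt(n)/2, sqrt(n)/2) prior."""
    return (x + sqrt(n) / 2) / (n + sqrt(n))


def risk(theta: float, n: int) -> float:
    """Exact mean squared error of the estimate for a binomial sample of size n."""
    return sum(
        comb(n, x) * theta**x * (1 - theta) ** (n - x)
        * (minimax_estimate(x, n) - theta) ** 2
        for x in range(n + 1)
    )


n = 25
risks = [risk(theta, n) for theta in (0.1, 0.3, 0.5, 0.9)]
constant = 1 / (4 * (1 + sqrt(n)) ** 2)  # value the risk should take for every theta

for theta, r in zip((0.1, 0.3, 0.5, 0.9), risks):
    print(f"theta = {theta}: risk = {r:.6f} (constant = {constant:.6f})")
```

The risk does not vary with $\theta$, which is what makes this Bayes rule minimax; changing `n` changes the prior, in line with the remark above.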

(2) Another way of regarding the examples is as problems in recognizing ancillary
statistics. In Example 1, should we think of $T$ as ancillary to $R$, or of $R$ as ancillary
to $T$?

(3) It is of some interest to inquire how near the above examples come to being unique
of their kind. In both cases, other sequential sampling rules besides the ‘inverse’ one could
have been considered (mathematical difficulty permitting), leading to yet other fiducial
distributions. Apart from that, the examples do seem to be unique, in the sense that the
Poisson and normal processes are (I believe) the only processes with continuous time
parameter and stationary independent increments from which examples of similar simplicity
could be constructed. Examples 1 and 2 both have the properties: (i) the end-point of the
sample path is ‘sufficient’ for the unknown parameter $\theta$, (ii) there are sequential sampling
rules defined by a boundary such that (with chance equal to 1) the sample path attains the
boundary exactly without jumping over it, and the time at which the boundary is attained
is a ‘sufficient’ statistic for $\theta$. Both these conditions are needed for simple application of


† A statistic that determines the likelihood function has been termed ‘sufficient’ by Fisher. The
definition carries the innuendo that the likelihood function is all we need to know.
the fiducial argument. Condition (ii) is not satisfied unless the process is either continuous
(and therefore normal) or purely discrete, all jumps having the same magnitude (as with the
Poisson process). A process of this latter sort, having negative as well as positive jumps, is
the difference of two independent Poisson processes. There are now two parameters.
However we select one parameter to be estimated, the process cannot (I think) be made
to satisfy (i).
TESTS FOR RANK CORRELATION COEFFICIENTS. I


By E. C. FIELLER, H. O. HARTLEY AND E. S. PEARSON


Statistical Advisory Unit, Ministry of Supply, London ; 
Iowa State College, Ames ; University College, London 


1. PURPOSE OF THE STUDY 


(1.1) The measures considered


The following is a first report on an investigation which became possible with the
availability of the 25,000 sets of correlated random normal deviates, 3000 of which were
published in Fieller, Lewis & Pearson’s (1955) Tracts for Computers, no. XXVI. The object
which we set ourselves was to study with the aid of these data the sampling distributions of,
and relationships between, three measures of rank correlation, in the case where the basic
variables which have been ranked follow bivariate normal distributions.

We shall use the following notation. Suppose that there are $n$ pairs of associated rankings

$$u_1, u_2, \ldots, u_n \quad\text{and}\quad v_1, v_2, \ldots, v_n,$$

where the integers $u_i$ $(i = 1, 2, \ldots, n)$ may be taken in ascending order $1, 2, \ldots, n$ and the
$v_i$ are a permutation of these integers. We shall consider in the present paper the two
following measures of correlation between these rankings:

(a) Spearman’s coefficient, which we denote by $r_S$. This is simply the product moment
correlation coefficient of $u_i$, $v_i$ and may be computed from the sum of squared differences

$$S_S = \sum_{i=1}^{n} (u_i - v_i)^2, \tag{1}$$

where

$$r_S = 1 - 6S_S/(n^3 - n). \tag{2}$$

(b) Kendall’s coefficient, $\tau$, which we denote by $r_K$. This may be computed as follows.
For every integer $u_i$ count the number of $v_j$ with $v_j > v_i$ and $j > i$; then add these counts to
obtain the positive score $P_K$. Then

$$r_K = 4P_K/(n^2 - n) - 1. \tag{3}$$

Both $r_S$ and $r_K$ lie between $+1$ and $-1$. We shall not be concerned here with ties among
the $u$’s or $v$’s.
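
The two computing rules can be sketched in a few lines of Python. This is an illustration only, assuming (as in the text) that the first ranking is in ascending order 1, 2, ..., n, that the second is a permutation of the same integers, and that there are no ties; the function names and the example ranking are hypothetical.

```python
def spearman_rs(v):
    """Spearman's coefficient from the sum of squared rank differences, equations (1)-(2)."""
    n = len(v)
    s_s = sum((u - v_i) ** 2 for u, v_i in enumerate(v, start=1))
    return 1 - 6 * s_s / (n**3 - n)


def kendall_rk(v):
    """Kendall's coefficient from the positive score P_K, equation (3)."""
    n = len(v)
    # P_K counts the pairs (i, j) with j > i whose v-ranks are in the same order.
    p_k = sum(1 for i in range(n) for j in range(i + 1, n) if v[j] > v[i])
    return 4 * p_k / (n**2 - n) - 1


v = [2, 1, 4, 3, 5]  # v-ranks listed in ascending order of the u-ranks
print(spearman_rs(v), kendall_rk(v))
```

For the ranking shown, the sum of squared differences is 4 and the positive score is 8. The double loop for the positive score takes of the order of $n^2$ steps, which is adequate for the sample sizes of 10 to 50 considered here.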

The following is a third coefficient which has been computed for all the sampling data
and which we hope to consider later:

(c) The Fisher-Yates coefficient. Let $\xi(i \mid n)$ be a so-called normal order statistic, i.e. the
expected value of the $i$th largest standardized deviate in a sample of $n$ observations from
a normal population. Then we may attach these score values to both the $u$ rankings
and the $v$ rankings. Fisher & Yates (1938, p. 50) have suggested that a measure of rank
correlation might be obtained from the product moment correlation coefficient of these
scores, namely

$$r_F = \sum_{i=1}^{n} \xi(u_i \mid n)\,\xi(v_i \mid n) \Big/ \sum_{i=1}^{n} \xi^2(i \mid n). \tag{4}$$

Convenient tables of the individual $\xi(i \mid n)$ as well as of $\sum_i \xi^2(i \mid n)$ are given, for example,
in Fisher & Yates (1938, Tables XX and XXI). As an approximation to the actual product
moment correlation coefficient in a normal sample, $r_F$ clearly has much to recommend
it; but the only discussions of this coefficient of which we are aware are those by Jeffreys
(1948, pp. 209-10) and Hoeffding (1951, pp. 86-9).


(1.2) Some known results on the distribution theory of $r_S$ and $r_K$

For a comprehensive summary of the older results, the reader may consult Kendall
(1948) and Moran (1950). Briefly these are as follows:

For independent random rankings (i.e. for random permutations of the $v_i$) the complete
distributions of $r_S$ and $r_K$ have been obtained for small $n$ by combinatorial enumeration.
Adequate approximations have been evolved for larger $n$.

In the case of correlated rankings it is first necessary to specify the nature of the
dependence. A discussion of this problem of appropriate population models was given by Daniels
(1950), and very recently Mallows (1957) has developed a new form of approach related to
paired-comparison theory. In the present paper we start from the assumption that the
$n$ pairs of rankings $u_i$, $v_i$ have arisen as the rank numbers in a sample of $n$ pairs of correlated
normal variates. Thus, if $x_i$, $y_i$ $(i = 1, 2, \ldots, n)$ denote a random sample of $n$ paired
observations from a bivariate normal population having correlation coefficient $\rho$, we suppose
that the $x_i$ are arranged in order of magnitude and that $v_i$ is the rank of $y_i$. This model
has received considerable attention and a certain number of theoretical results are known.
Thus we have

$$E(r_S) = \frac{6}{\pi(n+1)}\left\{\sin^{-1}\rho + (n-2)\sin^{-1}\tfrac{1}{2}\rho\right\} \quad\text{(Moran, 1948)}, \tag{5}$$

$$\mathrm{var}(r_S) = \frac{1}{n}\left\{1 - 1.563465\rho^2 + 0.304743\rho^4 + 0.155286\rho^6 + 0.061552\rho^8 + 0.022099\rho^{10} + \ldots\right\}. \tag{6}$$

Equation (6) is a large sample approximation due to Kendall (1949) and David, Kendall &
Stuart (1951). As we shall see below, it does not appear to be very accurate when the
sample size is as small as 10. Turning to $r_K$, we have

$$E(r_K) = \frac{2}{\pi}\sin^{-1}\rho \quad\text{(Greiner, 1909)}, \tag{7}$$

$$\mathrm{var}(r_K) = \frac{2}{n(n-1)}\left[1 - \left(\frac{2}{\pi}\sin^{-1}\rho\right)^2 + 2(n-2)\left\{\frac{1}{9} - \left(\frac{2}{\pi}\sin^{-1}\tfrac{1}{2}\rho\right)^2\right\}\right] \quad\text{(Esscher, 1924)}. \tag{8}$$

As far as we are aware, no results are available for the higher moments or cumulants of
$r_S$ or $r_K$, but Sundrum (1953) showed how the third and fourth moments of $r_K$ might be
obtained in the general case. He also used some random sampling results to give empirical
values for these moments, assuming underlying normal correlation, in the single case $\rho = 1/\sqrt{2}$.
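
The known moments quoted in (5) to (8) are easy to evaluate. The following Python sketch is an illustration only: the function names are not from the paper, and the series in (6) is truncated at the $\rho^{10}$ term exactly as printed.

```python
from math import asin, pi

# Coefficients of rho^0, rho^2, ..., rho^10 in the large-sample series of equation (6).
KENDALL_SERIES = (1.0, -1.563465, 0.304743, 0.155286, 0.061552, 0.022099)


def mean_rs(rho: float, n: int) -> float:
    """E(r_S) under bivariate normality, equation (5)."""
    return 6 / (pi * (n + 1)) * (asin(rho) + (n - 2) * asin(rho / 2))


def var_rs_large_sample(rho: float, n: int) -> float:
    """Large-sample approximation to var(r_S), equation (6), truncated as printed."""
    return sum(c * rho ** (2 * k) for k, c in enumerate(KENDALL_SERIES)) / n


def mean_rk(rho: float) -> float:
    """E(r_K) under bivariate normality, equation (7)."""
    return 2 / pi * asin(rho)


def var_rk(rho: float, n: int) -> float:
    """Exact var(r_K) under bivariate normality, equation (8)."""
    return 2 / (n * (n - 1)) * (
        1
        - (2 / pi * asin(rho)) ** 2
        + 2 * (n - 2) * (1 / 9 - (2 / pi * asin(rho / 2)) ** 2)
    )


for rho in (0.0, 0.5, 0.9):
    print(rho, mean_rs(rho, 10), mean_rk(rho), var_rk(rho, 10))
```

Two limiting cases give a quick check: at $\rho = 0$ equation (8) reduces to $2(2n+5)/\{9n(n-1)\}$, and at $\rho = 1$ it vanishes while (5) gives an expectation of 1.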

As can be seen from equations (6) and (8), the standard deviations of $r_S$ and $r_K$ change
with $\rho$. Further, as might be anticipated from the parallel case of the product moment
correlation coefficient, the shapes of the sampling distributions are found to change
with $\rho$. Thus, when we get away from the problem of using rank correlation coefficients in
tests of independence, we at once run into difficulties. The lack of results for dependent
rankings has made it difficult to compare the relative merits of different rank coefficients
in detecting dependence, nor has it been possible to use these coefficients for a comparison
of correlation in different populations. If we accept the underlying bivariate normal
structure, then we are faced with the distributional problem; if we do not accept this, then we
have also to look for a simple definition of non-parametric dependence.


(1.3) The present results and their bearing on these difficulties
While we do not claim to have solved all these difficulties we hope, in this paper, to have
compiled evidence which shows that the problem is capable of a simple solution provided the
rankings arise from the class of population models specified below. We proceed as follows:
(A) We start with rankings generated by sampling from a bivariate normal parent with
correlation $\rho$. With the help of extensive sampling experiments backed by analytical
approximation, we show that if $n$ is not too large the $z$-transforms

$$z_S = \tanh^{-1} r_S = \tfrac{1}{2}\log_e\frac{1+r_S}{1-r_S}, \qquad z_K = \tanh^{-1} r_K \tag{9}$$

are approximately normally distributed with variances nearly independent of $\rho$. In fact

$$\mathrm{var}(z_S) \simeq \frac{1.060}{n-3}, \qquad \mathrm{var}(z_K) \simeq \frac{0.437}{n-4}. \tag{10}$$

The expectation of $z_K$ can be expressed approximately as a simple function of $\rho$, making
use of the expressions for $E(r_K)$ and $\mathrm{var}(r_K)$ given in (7) and (8). The approximation to the
expectation of $z_S$ is less satisfactory in small samples owing to the inadequacy of the
expression (6) for $\mathrm{var}(r_S)$.* It should be noted, however, that, just as in using the
$z$-transformation for the product moment coefficient, a knowledge of the precise expectation
of the transformed variable is not necessary in a number of the test procedures that become
available.

(B) The results in A can clearly be extended to a much wider class of parental
distributions. If we start from a bivariate normal distribution of $x$, $y$ and introduce new variates
$X = f(x)$, $Y = g(y)$, the rankings of $X$ and $Y$ will clearly be identical with those of $x$ and $y$
provided the functions $f$ and $g$ are monotonic. Thus, the simple results under A will also
apply to rankings generated by the wider class of bivariate distributions of $X$, $Y$.
Conversely, starting from any bivariate distribution $\phi(X, Y)$ we can always find monotonic
transformations $X = f(x)$, $Y = g(y)$ to standardized normal variates $x$ and $y$. The resulting
bivariate distribution $\psi(x, y)$ will not necessarily be bivariate normal, but we think it likely
that in practical situations it would not differ greatly from this form.† This is a field in
which further investigation would be of considerable interest.

(C) Summarizing the results of A and B, we may state that if the rankings are generated
by one of a wide class of distributions of paired variables‡ $X$, $Y$, then the $z$ transforms of the
rank correlation measures, $z_S$ and $z_K$, can be regarded as normal variates with variances
dependent only on the sample size, and given approximately in equations (10). Further, within
this class of bivariate populations, either of the $z$ transforms is an unbiased estimate of
a function of the correlation $\rho$. This is the correlation in the bivariate distribution obtained
after distortion to normality of the marginal distribution of $X$ and $Y$. $\rho$ may be regarded
as a non-parametric measure of dependence. Without the need to specify $\rho$, simple tests of

* The approximation to the expectation of $z$ contains the variance of $r$; see equation (13) below.

† Johnson (1949) considered a particular case of surfaces having the property of being convertible
into the bivariate normal form through the application of his Sz and Sy transformations to the marginal
distributions.

‡ It is of course realized that other models of non-parametric dependence have been suggested in
which the ranks are not generated by a parental bivariate distribution. Such models are not considered
here.
significance may be applied to the $z$ values to determine whether two or more samples are
likely to have come from populations with a common $\rho$.

(D) Within these conditions it is possible to make approximate comparisons of the
relative merits of the rank coefficients $r_S$ and $r_K$ (and later we hope of $r_F$). In particular,
we may compare their power in detecting differences in population $\rho$ values.
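
A test of a common $\rho$ of the kind described in (C) might be sketched as follows. The sketch assumes the variance approximations that the paper gives for $z_S$ and $z_K$, namely $1.060/(n-3)$ and $0.437/(n-4)$, and assumes in addition that the two samples are independent and of the same size $n$, so that any dependence of the expectation of $z$ on $n$ is the same in both. The function names and the example coefficients are hypothetical.

```python
from math import atanh, sqrt
from statistics import NormalDist

# (a, b) in var(z) = a / (n - b), as suggested in the paper for each coefficient.
VAR_CONSTANTS = {"spearman": (1.060, 3), "kendall": (0.437, 4)}


def z_variance(kind: str, n: int) -> float:
    """Approximate variance of the z-transformed rank coefficient."""
    a, b = VAR_CONSTANTS[kind]
    return a / (n - b)


def common_rho_test(r1: float, r2: float, n: int, kind: str):
    """Normal deviate and two-sided P value for a difference between two z values."""
    difference = atanh(r1) - atanh(r2)
    standard_error = sqrt(2 * z_variance(kind, n))
    deviate = difference / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(deviate)))
    return deviate, p_value


# Hypothetical example: two samples of 30 with Spearman coefficients 0.60 and 0.20.
deviate, p_value = common_rho_test(0.60, 0.20, 30, "spearman")
print(f"deviate = {deviate:.3f}, two-sided P = {p_value:.3f}")
```

Nothing in the calculation requires the common value of $\rho$ to be specified, which is the practical attraction of working with $z$.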


2. THE EXPERIMENTAL DISTRIBUTIONS OF SPEARMAN’S AND KENDALL’S COEFFICIENTS
(2.1) The distributions of $r_S$ and $r_K$

The experimental sampling made full use of the 25,000 sets of correlated normal deviates
referred to in § 1.1. Thus, we had 2500 samples with $n = 10$, 833 with $n = 30$ and 500 with
$n = 50$. For each value of $n$ we had samples from nine bivariate normal populations, namely
those with $\rho = 0.1(0.1)0.9$. The samples of 10, 30 and 50 were independent in the sense
that the 25,000 cards containing the basic data were re-shuffled between each of the three
experiments. The basic calculations for our study were all carried out in the Mathematics
Division of the National Physical Laboratory. The samples were formed and ranked on the
Division’s punched card installation under the supervision of Miss M. U. Thomas. She was
responsible, also, for the calculation of all the values of $S_S$ (of equation (1)) and $S_F$ (the
numerator on the right-hand side of equation (4)) and for that of $P_K$ (of equation (3)) for
samples of size 10. The values of $P_K$ for samples of sizes 30 and 50 were obtained on the DEUCE
digital computer by Mr T. Vickers and Mr B. W. Munday. An account of the methods used
will be given in a later paper; we plan also to print the observed frequency distributions
corresponding to the various coefficients.

Comparison of the observed mean values of $r_S$ with the theoretical values of equation (5)
and of the means and variances of $r_K$ with equations (7) and (8) is only useful as a check on
the representative character of the random samples. This check has been made and passed
satisfactorily; the observed values are not reproduced here. Examination of the variance
of $r_S$ is however necessary, equation (6) giving an approximation only to the true value.


(2.2) The variance of $r_S$
The Kendall formula (6) does not give the correct values of $1/(n-1)$ and 0 to $\mathrm{var}(r_S)$
when $\rho = 0$ and 1, respectively. A purely empirical adjustment is obtained by substituting
$n-1$ for $n$ as divisor and adding a term $+0.019785\rho^{12}$ which reduces the variance to zero
when $\rho = 1$, so that we have

$$\mathrm{var}(r_S) = \frac{1}{n-1}\left\{1 - 1.563465\rho^2 + 0.304743\rho^4 + 0.155286\rho^6 + 0.061552\rho^8 + 0.022099\rho^{10} + 0.019785\rho^{12}\right\}. \tag{11}$$

Table 1 contains for each of the three sample sizes, (a) the estimated variance from equation
(11), (b) the observed variance from the sampling experiment, (c) smoothed values of (b)
obtained by a rough graphical process. These last values are made use of in § 3.2 below. It
will be seen that for $n = 10$, the modified Kendall formula (11) gives values which for
$\rho > 0.3$ are consistently smaller than the observed values. The theoretical approximation
is also too small, but less noticeably so, when $n = 30$. It seems clear that for small samples
$\mathrm{var}(r_S)$ cannot be accurately expressed as the product of a function of $n$ and a function of $\rho$.
Below, when approximating to the variance of $z_S = \tanh^{-1} r_S$ we have therefore used the
smoothed observed values of $\mathrm{var}(r_S)$ taken from the third columns of Table 1.
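
The adjusted formula (11) is simple to evaluate, and doing so reproduces the 'From (11)' columns of Table 1. The Python sketch below is illustrative; the function name is not from the paper.

```python
# Coefficients of rho^0, rho^2, ..., rho^12 in the adjusted formula (11).
ADJUSTED_SERIES = (1.0, -1.563465, 0.304743, 0.155286, 0.061552, 0.022099, 0.019785)


def var_rs_adjusted(rho: float, n: int) -> float:
    """Empirically adjusted Kendall approximation to var(r_S), equation (11)."""
    series = sum(c * rho ** (2 * k) for k, c in enumerate(ADJUSTED_SERIES))
    return series / (n - 1)


# One row per rho = 0.1, ..., 0.9; one column per sample size.
for tenth in range(1, 10):
    rho = tenth / 10
    row = [round(var_rs_adjusted(rho, n), 4) for n in (10, 30, 50)]
    print(rho, row)
```

The added $\rho^{12}$ coefficient is exactly what is needed for the seven coefficients to sum to zero, so the formula vanishes at $\rho = 1$ for every $n$.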
Table 1. Variance of Spearman’s $r_S$

                  n = 10                          n = 30                          n = 50
  ρ     From (11)   Obs.    Smoothed    From (11)   Obs.    Smoothed    From (11)   Obs.    Smoothed
                              obs.                            obs.                            obs.

 0.1     0.1094    0.1061    0.1093      0.0339    0.0334    0.0342      0.0201    0.0192    0.0203
 0.2     0.1042    0.1041    0.1055      0.0323    0.0338    0.0336      0.0191    0.0215    0.0200
 0.3     0.0958    0.1002    0.1002      0.0297    0.0321    0.0317      0.0176    0.0192    0.0183
 0.4     0.0843    0.0916    0.0923      0.0261    0.0263    0.0273      0.0155    0.0160    0.0165
 0.5     0.0701    0.0801    0.0805      0.0218    0.0227    0.0225      0.0129    0.0133    0.0135

 0.6     0.0539    0.0638    0.0644      0.0167    0.0181    0.0172      0.0099    0.0110    0.0105
 0.7     0.0366    0.0443    0.0470      0.0114    0.0117    0.0117      0.0067    0.00604   0.0065
 0.8     0.0199    0.0322    0.0303      0.0062    0.00674   0.0067      0.0037    0.00348   0.0035
 0.9     0.0062    0.0125    0.0130      0.0019    0.00241   0.0024      0.0011    0.00111   0.0011



3. THE TRANSFORMATION OF THE RANK CORRELATION COEFFICIENTS 


(3.1) The transformation and its justification
Our object is to find transformations of $r_S$ and $r_K$ which will give variances approximately
independent of $\rho$ and will at the same time make the distributions roughly normal. The
basic distributions of $r_S$ and $r_K$ become increasingly skew as $|\rho| \to 1$. It is natural that we
should consider R. A. Fisher’s $z$ transform which proved so successful in the case of the
product moment correlation coefficient in normal samples. If we write in general

$$z = \tanh^{-1} r = \tfrac{1}{2}\log_e\frac{1+r}{1-r}, \tag{12}$$

then the cumulants of $z$ may be expanded in series in terms of the cumulants of $r$. The
leading terms of the expansions for the mean and variance of $z$ are given in equations (13)
and (14):

$$E(z) = \tfrac{1}{2}\log_e\frac{1+\bar r}{1-\bar r} + \frac{\bar r\,\kappa_2(r)}{(1-\bar r^2)^2} + \ldots, \tag{13}$$

$$\mathrm{var}(z) = \frac{\kappa_2(r)}{(1-\bar r^2)^2} + \ldots, \tag{14}$$

where $\bar r = \kappa_1(r) = E(r)$.

The distribution of $r$ depends only on the single parameter $\rho$ and it will be seen that to
a first approximation the $z$ transformation may be expected to stabilize the variance of a
statistic $r$ if the ratio of $\mathrm{var}(r)$ to $(1-\bar r^2)^2$ is independent of $\rho$, or nearly so. We have given
these ratios in Table 2, $\bar r_S$, $\bar r_K$ and $\mathrm{var}(r_K)$ being obtained exactly from equations (5), (7)
and (8), respectively, and for $\mathrm{var}(r_S)$ we have used the smoothed observed values from
Table 1. The ratios are least constant for $n = 10$ where, in particular, there is a definite
increase for $\rho = 0.9$. Further useful comment must await the calculation of $\kappa_3(r)$ and $\kappa_4(r)$
and a fuller study of the expansions for the cumulants of $z$, but in the meantime we have
felt no hesitation in going further with the use of the $z$ transforms.
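
The leading terms quoted in (13) and (14) can be recovered from a Taylor expansion of $z = \tanh^{-1} r$ about $\bar r$. The following is a sketch of that step only, assuming that terms involving the third and higher cumulants of $r$ may be neglected; it is not a reproduction of the full expansions referred to in the text.

$$\frac{dz}{dr} = \frac{1}{1-r^2}, \qquad \frac{d^2z}{dr^2} = \frac{2r}{(1-r^2)^2},$$

$$z = \tanh^{-1}\bar r + \frac{r-\bar r}{1-\bar r^2} + \frac{\bar r\,(r-\bar r)^2}{(1-\bar r^2)^2} + \ldots$$

Taking expectations, with $E(r-\bar r) = 0$ and $E(r-\bar r)^2 = \kappa_2(r)$,

$$E(z) \simeq \tanh^{-1}\bar r + \frac{\bar r\,\kappa_2(r)}{(1-\bar r^2)^2}, \qquad \mathrm{var}(z) \simeq \frac{\kappa_2(r)}{(1-\bar r^2)^2}.$$

The variance of $z$ is therefore stable, to this order, exactly when $\mathrm{var}(r)/(1-\bar r^2)^2$ does not change with $\rho$, which is the ratio tabulated in Table 2.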
Table 2. First approximations to the variance of $z_S$ and $z_K$

                 $\mathrm{var}(r_S)/(1-\bar r_S^2)^2$*             $\mathrm{var}(r_K)/(1-\bar r_K^2)^2$†
     ρ          n = 10    n = 30    n = 50           n = 10    n = 30    n = 50

    0.1         0.111     0.035     0.021            0.0618    0.0166    0.00952
    0.2         0.112     0.036     0.021            0.0619    0.0166    0.00950
    0.3         0.115     0.037     0.022            0.0622    0.0166    0.00947
    0.4         0.120     0.037     0.022            0.0627    0.0165    0.00943
    0.5         0.124     0.037     0.022            0.0634    0.0165    0.00937

    0.6         0.126     0.037     0.023            0.0644    0.0164    0.00930
    0.7         0.130     0.038     0.022            0.0662    0.0164    0.00920
    0.8         0.141     0.039     0.022            0.0697    0.0164    0.00910
    0.9         0.155     0.048     0.022            0.0787    0.0168    0.00905

    Mean,       0.1224    0.0370    0.0219           0.06404   0.01649   0.00936
    ρ = 0.1 to 0.8


* $\mathrm{var}(r_S)$ obtained from smoothing the experimental values.
† $\mathrm{var}(r_K)$ is the correct theoretical value.


Frequency tables of the distributions of $z_S$ and $z_K$ have been obtained and the following
sections are concerned with comments on the mean values and variances obtained from
these tables and with the normality of the distributions.


(3.2) The mean values of $z_S = \tanh^{-1} r_S$ and $z_K = \tanh^{-1} r_K$
In Table 3 we compare


(a) the observed mean values of $z_S$ found from the experimental data;
(b) the approximate theoretical value of $E(z_S)$ given by the first two terms of (13), namely

$$E(z_S) = \tanh^{-1}\bar r_S + \bar r_S\,\mathrm{var}(r_S)/(1-\bar r_S^2)^2, \tag{15}$$

where $\bar r_S$ is calculated from (5) and $\mathrm{var}(r_S)$ is the smoothed observed value already referred to;

(c) the second or ‘corrective term’ from the right-hand side of (15).

Owing to the fact that in a few samples of 10, the rankings of the two variates were in
perfect agreement, some values of $r_S$ (and $r_K$) are unity and the corresponding $z_S$ (and $z_K$)
become infinite.* The means and variances tabled omit these observations which in any
case form a very small part of a distribution of 2500 observations. We first, however, made
estimates of the mean and variance of $z$, using the technique for a censored distribution, but
the difference in results was not large enough to be of importance. Having regard to the
standard errors quoted below the table† it will be seen that the differences between
observation and approximate theory are not significant except perhaps in the case of $\rho = 0.9$.
The corrective term is of some importance in small samples with large $\rho$, but is steadily

* This happened in seven cases for $\rho = 0.9$, in three cases for $\rho = 0.8$ and once for $\rho = 0.7$.


† The standard error of $\bar z$ is $\sigma_z/\sqrt{N}$, where the averaged values of $\sigma_z$ given below Table 5 for $z_S$ and
Table 6 for $z_K$ have been used and $N$ = 2500, 833 and 500, respectively.
reduced in importance as $n$ increases. In the case of the transformed product moment
correlation coefficient a similar, if less important, effect is present. Table 4 gives similar
results for the mean values of $z_K$, except that in this case the true values of $\mathrm{var}(r_K)$ may be


Table 3. Mean values of $z_S$

                  n = 10                         n = 30                         n = 50
  ρ      Obs.     Approx.   Corr.       Obs.     Approx.   Corr.       Obs.     Approx.   Corr.
                  theory    term                 theory    term                 theory    term

 0.1     0.094    0.097     0.010       0.097    0.096     0.003       0.096    0.096     0.002
 0.2     0.195    0.195     0.019       0.195    0.194     0.007       0.191    0.194     0.004
 0.3     0.304    0.299     0.030       0.297    0.296     0.010       0.297    0.296     0.006
 0.4     0.416    0.409     0.042       0.410    0.405     0.014       0.406    0.405     0.008
 0.5     0.526    0.529     0.055       0.517    0.525     0.017       0.522    0.525     0.010

 0.6     0.671    0.665     0.068       0.661    0.661     0.021       0.665    0.663     0.013
 0.7     0.842*   0.825     0.082       0.833    0.826     0.025       0.838    0.828     0.014
 0.8     1.032*   1.038     0.103       1.051    1.043     0.030       1.056    1.047     0.016
 0.9     1.374*   1.361     0.131       1.406    1.389     0.038       1.417    1.399     0.019

Approximate theory = $\tanh^{-1}\bar r_S + \bar r_S\,\mathrm{var}(r_S)/(1-\bar r_S^2)^2$, using $\bar r_S$ from equation (5) and the smoothed
observed $\mathrm{var}(r_S)$.

Corrective term = second term in expression for approximate theory.

Standard errors of observed means: for n = 10 about 0.008; for n = 30, 50, about 0.007.


Table 4. Mean values of $z_K$

                  n = 10                         n = 30                         n = 50
  ρ      Obs.     Approx.   Corr.       Obs.     Approx.   Corr.       Obs.     Approx.   Corr.
                  theory    term                 theory    term                 theory    term

 0.1     0.065    0.068     0.004       0.066    0.065     0.001       0.065    0.064     0.001
 0.2     0.135    0.137     0.008       0.131    0.131     0.002       0.128    0.130     0.001
 0.3     0.209    0.209     0.012       0.199    0.200     0.003       0.199    0.198     0.002
 0.4     0.289    0.285     0.016       0.275    0.273     0.004       0.271    0.271     0.002
 0.5     0.361    0.368     0.021       0.346    0.352     0.005       0.346    0.350     0.003

 0.6     0.465    0.462     0.026       0.439    0.442     0.007       0.439    0.439     0.004
 0.7     0.582*   0.574     0.033       0.551    0.549     0.008       0.550    0.545     0.005
 0.8     0.717*   0.719     0.041       0.692    0.688     0.010       0.689    0.684     0.005
 0.9     0.961*   0.949     0.056       0.917    0.905     0.012       0.909    0.899     0.006

Approximate theory = $\tanh^{-1}\bar r_K + \bar r_K\,\mathrm{var}(r_K)/(1-\bar r_K^2)^2$, where $\bar r_K$ and $\mathrm{var}(r_K)$ are derived from
equations (7) and (8).

Corrective term = second term in expression for approximate theory.

Standard errors of observed means: for n = 10 about 0.0055; for n = 30, 50 about 0.0045.

* Ignoring the few infinite values.
Table 5. Observed variance and standard deviation of $z_S$

                        Variance                              S.D.
     ρ          n = 10     n = 30     n = 50        n = 10    n = 30    n = 50

    0.1         0.1380     0.0365     0.0204        0.371     0.191     0.143
    0.2         0.1407     0.0378     0.0239        0.375     0.194     0.155
    0.3         0.1473     0.0405     0.0238        0.384     0.201     0.154
    0.4         0.1528     0.0374     0.0226        0.391     0.193     0.150
    0.5         0.1537     0.0407     0.0230        0.392     0.202     0.152

    0.6         0.1507     0.0389     0.0246        0.388     0.197     0.157
    0.7         0.1423*    0.0380     0.0213        0.377     0.195     0.146
    0.8         0.1643*    0.0409     0.0227        0.405     0.202     0.151
    0.9         0.1700*    0.0465     0.0235        0.412     0.216     0.153

    Mean,       0.14872    0.03884    0.02279       0.385     0.197     0.151
    ρ = 0.1 to 0.8

    $1.0296/\sqrt{n-3}$                                  0.389     0.198     0.150

Standard errors of S.D.: n = 10, 0.0055; n = 30, 0.0049; n = 50, 0.0047.

Table 6. Observed variance and standard deviation of $z_K$

                        Variance                              S.D.
     ρ          n = 10     n = 30     n = 50        n = 10    n = 30    n = 50

    0.1         0.06830    0.01700    0.00933       0.2613    0.1304    0.0966
    0.2         0.06884    0.01758    0.01074       0.2624    0.1327    0.1036
    0.3         0.07290    0.01830    0.01049       0.2700    0.1353    0.1024
    0.4         0.07446    0.01639    0.00991       0.2729    0.1280    0.0996
    0.5         0.07443    0.01712    0.00966       0.2728    0.1308    0.0983

    0.6         0.07384    0.01628    0.00985       0.2717    0.1276    0.0992
    0.7         0.06949*   0.01514    0.00822       0.2636    0.1230    0.0907
    0.8         0.08126*   0.01551    0.00824       0.2851    0.1245    0.0908
    0.9         0.08910*   0.01712    0.00780       0.2985    0.1308    0.0883

    Mean,       0.07294    0.01667    0.00956       0.2700    0.1290    0.0977
    ρ = 0.1 to 0.8

    $0.661/\sqrt{n-4}$                                   0.2699    0.1297    0.0975

Standard errors of S.D.: n = 10, 0.0038; n = 30, 0.0032; n = 50, 0.0031.

* Ignoring the few infinite values.
derived from equation (8). Again, the differences between observation and the
approximation appear only to be significant when $\rho = 0.9$. The corrective term is important for $n = 10$
and $\rho$ large; it is smaller in proportion than the corresponding term in the approximation
to mean $z_S$.
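
Because the mean and variance of Kendall's coefficient are known exactly, the 'approximate theory' and 'corrective term' columns for $z_K$ can be recomputed directly. The Python sketch below is illustrative only (the function names are not from the paper); it applies the two-term approximation to equations (7) and (8).

```python
from math import asin, atanh, pi


def mean_rk(rho: float) -> float:
    """E(r_K), equation (7)."""
    return 2 / pi * asin(rho)


def var_rk(rho: float, n: int) -> float:
    """var(r_K), equation (8)."""
    return 2 / (n * (n - 1)) * (
        1
        - (2 / pi * asin(rho)) ** 2
        + 2 * (n - 2) * (1 / 9 - (2 / pi * asin(rho / 2)) ** 2)
    )


def approx_mean_zk(rho: float, n: int):
    """Two-term approximation to E(z_K); returns (approximate mean, corrective term)."""
    r_bar = mean_rk(rho)
    corrective = r_bar * var_rk(rho, n) / (1 - r_bar**2) ** 2
    return atanh(r_bar) + corrective, corrective


for rho in (0.1, 0.5, 0.9):
    total, corrective = approx_mean_zk(rho, 10)
    print(f"rho = {rho}: approx. theory = {total:.3f}, corrective term = {corrective:.3f}")
```

For $n = 10$ the corrective term is a few per cent of the total at every $\rho$, and for the larger samples it falls roughly in proportion to $1/n$, in line with the remark that it matters mainly in small samples.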
(3.3) The variances and standard deviations of $z_S$ and $z_K$

Tables 5 and 6 contain the observed variances and standard deviations for the
transformed variables.* Comparison with Table 2 shows that the first term in the expansions for
$\mathrm{var}(z_S)$ and $\mathrm{var}(z_K)$ is definitely not adequate when $n = 10$ and still somewhat in defect
for the larger samples. These points are brought out by a comparison of the mean values
given at the bottom of the tables for the eight cases $\rho = 0.1$ to $0.8$. Apart from the extreme
case with $\rho = 0.9$, the change in the variance of $z$ with $\rho$ is not very great. We shall not
attempt now to discuss the changes further. The figures, however, suggest that for most
practical purposes if $\rho < 0.8$ it will be justifiable to assume a constant variance for $z$, for any
given sample size not greatly exceeding 50. The expressions given below are not, however,
to be regarded as asymptotic results.

Assuming that we may use the observed mean values given at the bottom of Tables 5
and 6 we may then look for a general empirical expression for the variance of the form

$$\mathrm{var}(z) = a/(n-b),$$

where $b$ is an integer. We suggest the use of the following:
For Spearman’s coefficient

$$\mathrm{var}(z_S) = \frac{1.060}{n-3}, \quad\text{i.e.}\quad \sigma_{z_S} = \frac{1.03}{\sqrt{n-3}}.$$

For Kendall’s coefficient

$$\mathrm{var}(z_K) = \frac{0.437}{n-4}, \quad\text{i.e.}\quad \sigma_{z_K} = \frac{0.66}{\sqrt{n-4}}.$$

The resulting approximations for the standard deviations of $z_S$ and $z_K$ when $n = 10$, 30
and 50 are given at the bottom of the right-hand side of Tables 5 and 6 where they may be
compared with the individual sampled values and the means of the latter for $\rho = 0.1$ to $0.8$.
We think that except for $\rho > 0.8$, the approximation can be safely used in tests of
significance for $10 \le n \le 50$, provided, of course, that the underlying conditions discussed in § 1.3
are applicable to the data.
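
The empirical expressions are easily tabulated. The short Python sketch below (function names illustrative) evaluates the two standard deviations at the three sample sizes used in the experiment and prints beside them the square roots of the mean observed variances copied from the foot of Tables 5 and 6.

```python
from math import sqrt


def sd_z_spearman(n: int) -> float:
    """Empirical standard deviation of z_S, from var(z_S) = 1.060 / (n - 3)."""
    return sqrt(1.060 / (n - 3))


def sd_z_kendall(n: int) -> float:
    """Empirical standard deviation of z_K, from var(z_K) = 0.437 / (n - 4)."""
    return sqrt(0.437 / (n - 4))


# Mean observed variances for rho = 0.1 to 0.8, copied from Tables 5 and 6.
observed_mean_variance = {
    "spearman": {10: 0.14872, 30: 0.03884, 50: 0.02279},
    "kendall": {10: 0.07294, 30: 0.01667, 50: 0.00956},
}

for n in (10, 30, 50):
    print(
        n,
        round(sd_z_spearman(n), 4),
        round(sqrt(observed_mean_variance["spearman"][n]), 4),
        round(sd_z_kendall(n), 4),
        round(sqrt(observed_mean_variance["kendall"][n]), 4),
    )
```

As the text cautions, these are fits to the sampled range of $n$ and should not be extrapolated as asymptotic results.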


4. NORMALITY OF THE z DISTRIBUTIONS; PRELIMINARY COMMENTS


In the case of $n = 10$ three difficulties arise in examining the fit of a normal curve to the
experimental distributions. In the first place, as mentioned above, in a few samples there
was complete agreement between the two rankings so that $r_S$ and $r_K$ were unity and $z_S$
and $z_K$ were consequently infinite. These observations have to be omitted in calculating
the moments of $z$; alternatively, moments could be estimated using the technique for
dealing with truncated observations. Secondly, while possible values for $r_S$ and $r_K$ are
equally spaced, the possible values of $z_S$ and $z_K$ occur at intervals which increase with the
$z$ value. For $n = 10$, where the number of permissible values is relatively small, it is a little
difficult to know what criterion of normality to adopt. Finally, the distributions of $r_S$
exhibit, particularly for low values of $\rho$, the ‘saw-edged’ character noted by Kendall
(1948, p. 47) in the case of independence. These factors all make it difficult to know how to


* For $n = 30$, 50 the variances tabled are $m_2$, but for $n = 10$ they are $k_2 = m_2 N/(N-1)$.
assess the importance of excessive values of $\chi^2$ when comparing the observed distributions
with fitted normal curves. Although we have not yet available all the values of $\beta_2(z)$, it
appears, as in the case of the $z$ transform of the product moment correlation coefficient,
that the $z$ distributions are somewhat leptokurtic ($\beta_2 > 3$).

For $n = 30$ and 50 we have fitted a certain number of normal curves to the $z$ distributions.
The result of applying the $\chi^2$ test for goodness-of-fit is summarized in Table 7. Apart from
three values of $\chi^2$ which are over 30, the fits appear very reasonable. It is clear that the
matter needs further investigation, but we doubt whether even in samples as small as 10,
the assumption of a normal $z$ distribution will lead to any serious misinterpretation of a
significance test.


Table 7. Normal curve fits to observed distributions of zg and zx 





























Total | 216-8 195 | 1716 | 177 | 82:9 | 62 


| | 
zs; n= 30 zs; n = 50 | zx; n = 30 | zx; n = 50 
| | | 
im | | | | | 
a Ae Ree Lae gat | D.F. x? D.F. 
| | | | 
recap | Burge % | rial eee 
O1 | 2112 | 2 163 | 19 —- | — in fm 
02 | 329 22 206 | 20 —- | — as ven 
os | 378 | 38 | 201 20 =) — — 
<n ae Sa le ee. cae dee o a 
0-5 | 287 | 22 | 131 | 20 | 221 | 21 12-9 21 | 
| | | 
oe Pom hho Bo may met |S ile as lhe 
0-7 16-6 | ie old Ed ees z_ — | = 
08 21-9 22 | 156 | 19 20 | 20 | 147 | 2 
09 | 230 | 23 | 159 | 20 | 348 | 21 | 228 | 19 
| 
a | | ae a a | | 
| | | | 
| | 499 | 61 
| | 





5. THE SENSITIVITY OF THE CORRELATION MEASURES TO CHANGES IN p 


In broad terms the power of discrimination of any one of the possible correlation measures
depends upon the rapidity with which its sampling distributions draw clear of one another
as the population $\rho$ changes. If, for example, for a given value of $n$, the distribution of $r_S$
for $\rho = 0.2$ does not sensibly overlap the distribution for $\rho = 0.8$, then if a single sample of
$n$ is drawn from each population a test of significance will always establish a difference in
population $\rho$ values. The amount of overlap can of course be seen most directly in the
distributions of $r_S$ and $r_K$ (or $S_S$ and $P_K$) which we hope to publish later.

If the distributions of the $z$’s were normal with a standard deviation $\sigma_z$ which is fixed for
a given sample size, the efficiency of discrimination would depend on the way in which the
scale of mean $z$ expressed in standard measure (i.e. $E(z)/\sigma_z$) opened out as $\rho$ is increased from
0 to 1. Without assuming a constant $\sigma_z$, we can obtain a rough measure of local sensitivity
by calculating the ratios $(\bar z_1 - \bar z_2)/\sqrt{(s_{z_1}^2 + s_{z_2}^2)}$ of

(a) the differences between pairs of consecutive observed means given in Table 3 (or 4), to

(b) the square roots of the sum of the corresponding pair of observed variances from
Table 5 (or 6).


These ratios are given in Table 8 for both Spearman’s and Kendall’s $z$. We have also given
corresponding ratios for the product moment correlation coefficient, taking $E(z)$ and $\sigma_z^2$
from the full Fisher expansions as corrected by Gayen (1951, p. 236). Having regard to
sampling fluctuations, it is clear that we cannot establish any difference in sensitivity
between the two rank coefficients for $n = 10$. At $n = 30$ and 50 and for $\rho > 0.6$ the ratio for
$z_K$ is consistently larger than for $z_S$, which suggests a possible advantage for Kendall’s
coefficient. More detailed examination of this point is however needed. It will be noted,
as expected, that the product moment coefficient is throughout more sensitive to changes
in $\rho$ than either of the rank coefficients. In all cases for a given difference in $\rho$, the power
of discrimination increases with $\rho$.


Table 8. Sensitivity ratios $(\bar{z}_1 - \bar{z}_2)/\sqrt{s_{z_1}^2 + s_{z_2}^2}$ for different coefficients

                     |           n = 10            |           n = 30            |           n = 50
 $\rho_1$, $\rho_2$  | Product  Spearman  Kendall  | Product  Spearman  Kendall  | Product  Spearman  Kendall
                     | moment                      | moment                      | moment
 0.1, 0.2            | 0.205    0.191     0.188    | 0.383    0.359     0.352    | 0.501    0.452     0.447
 0.2, 0.3            | 0.214    0.202     0.198    | 0.399    0.362     0.359    | 0.523    0.489     0.484
 0.3, 0.4            | 0.228    0.206     0.208    | 0.427    0.405     0.406    | 0.558    0.503     0.508
 0.4, 0.5            | 0.250    0.198     0.188    | 0.469    0.384     0.387    | 0.615    0.542     0.538

 0.5, 0.6            | 0.286    0.263     0.270    | 0.537    0.508     0.508    | 0.704    0.655     0.664
 0.6, 0.7            | 0.345    0.315     0.308    | 0.649    0.622     0.636    | 0.851    0.809     0.828
 0.7, 0.8            | 0.456    0.345     0.348    | 0.861    0.776     0.805    | 1.13     1.04      1.08
 0.8, 0.9            | 0.732    0.592     0.592    | 1.39     1.20      1.24     | 1.82     1.68      1.74








Concluding remarks 


Besides putting on record the basic sampling distributions we hope in a further paper 
to carry our investigations further in a number of directions, in particular to give parallel 
results for the coefficient r, of equation (4). 


We should like to express our great indebtedness to Miss M. U. Thomas and Mr T. Vickers 
whose work has already been mentioned, to Mrs Esmé Hill, formerly of the Statistical 
Advisory Unit, Ministry of Supply, and to Mrs Maxine Merrington and Miss Janet Hall of 
University College London. Finally, we should like to say how much we owe to the co- 
operation of the Mathematics Division of the National Physical Laboratory for the facilities 
which made this investigation possible. 
THE TWO-SAMPLE t-TEST BASED ON RANGE


By P. G. MOORE 
University College London 


1. INTRODUCTION 


A test for the significance of a difference in means, using an estimate of standard deviation based on
sample range, was suggested by Lord (1947) as a quick substitute for Student’s t-test in the case where
the two samples are of equal size. Thus, in Table 10 of his paper Lord gives six significance levels for the
ratio $|\bar{x}_1 - \bar{x}_2| / \tfrac{1}{2}(w_1 + w_2)$, where $\bar{x}_1$, $\bar{x}_2$ are the means and $w_1$, $w_2$ the ranges in samples of size n from two
normal populations having a common standard deviation, $\sigma$. The null hypothesis is that the population
means, $\mu_1$, $\mu_2$, are equal.

This test has been in considerable use as a quick method of detecting differences in means, particularly 
in the industrial field. In a discussion which took place at a weekend conference of the Industrial 
Applications Section of the Royal Statistical Society, held in July 1956, speakers asked whether it 
would be possible to provide a similar quick test for the case where the sample sizes $n_1$, $n_2$ were not equal.
The present paper gives an answer to this question. 

When $n_1 \ne n_2$, a number of points need special consideration.

(a) The most efficient range estimate of $\sigma$ is no longer based on the unweighted mean of the two
sample ranges. If for practical convenience we decide to use $\theta = w_1 + w_2$ as our estimating statistic
rather than $\phi = w_1 + f(n_1, n_2)\,w_2$ (where f is an appropriate weighting function), it is desirable to know
what loss of power is involved. This will be considered in § 4, where it is shown that the loss is small.

(b) In Table 2 we give, however, two functions which enable the more accurate estimate of $\sigma$ based
on $\phi$ to be obtained if desired.

(c) With $n_1 \ne n_2$ there is no longer any point in using Lord’s mean range and we shall take as our
test ratio

    $u = \dfrac{|\bar{x}_1 - \bar{x}_2|}{w_1 + w_2}$.    (1)

Table 1 printed at the end of this paper gives values of u corresponding to four different significance
levels, $\alpha$, used in a two-tailed test.

(d) We have provided a table for all combinations of $n_1$, $n_2$ in the range 2 to 20. Computation by
quadrature as used by Lord in the case $n_1 = n_2$ would now be very heavy and it was therefore decided to
use an approximation. We have adopted the type of approximation first suggested by Patnaik (1950)
and have taken $w_1 + w_2$ to be distributed as a multiple of $\chi$ with appropriate ‘equivalent’ degrees of
freedom $\nu$ depending on $n_1$ and $n_2$. Following this approach it is possible to represent the distribution
of the ratio u of equation (1) in terms of Student’s t with modified degrees of freedom which are less than
$n_1 + n_2$. The accuracy of this approximation is examined in § 5. We shall first give two examples
illustrating the use of Table 1.


2. ILLUSTRATIONS 


Example (i). Two operators, A and B, make determinations of the percentage of ammonia in plant
gas, with the following results:

    Operator A    39  35  43  32  36  48  33  33
    Operator B    43  44  56  63  46

From these figures can it be said that the two operators are consistently measuring the same thing?
The data give

    $\bar{x}_1 = 37.375$,  $w_1 = 16$,  $\sum_i (x_{1i} - \bar{x}_1)^2 = 221.875$;

    $\bar{x}_2 = 50.4$,  $w_2 = 20$,  $\sum_i (x_{2i} - \bar{x}_2)^2 = 305.2$.

Hence

    $u = \dfrac{50.4 - 37.375}{16 + 20} = 0.362$,







and from Table 1 with $n_1 = 5$ and $n_2 = 8$, u is just beyond the 1 % level of significance, remembering
that we are using a two-tailed test here. The usual form of t-test gives

    $t = \dfrac{50.4 - 37.375}{\sqrt{\left(\tfrac{1}{8} + \tfrac{1}{5}\right)\dfrac{527.075}{11}}} = 3.301$.

The degrees of freedom, $\nu$, are 11 and t is again just beyond the 1 % level of significance. Hence, with
both tests we would have detected a real difference between the two operators.
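
The arithmetic of Example (i) can be retraced with a short Python sketch. The function names are illustrative and not part of the paper; the data are those of the two operators, and the pooled t statistic is the usual Student form with $n_1 + n_2 - 2$ degrees of freedom.

```python
import math


def range_test_ratio(x, y):
    """u of equation (1): absolute difference of means over the sum of ranges."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return abs(mean_x - mean_y) / ((max(x) - min(x)) + (max(y) - min(y)))


def pooled_t(x, y):
    """Student's two-sample t with a pooled root-mean-square estimate of sigma."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sum_sq = sum((v - mean_x) ** 2 for v in x) + sum((v - mean_y) ** 2 for v in y)
    dof = len(x) + len(y) - 2
    std_err = math.sqrt((1.0 / len(x) + 1.0 / len(y)) * sum_sq / dof)
    return abs(mean_x - mean_y) / std_err, dof


operator_a = [39, 35, 43, 32, 36, 48, 33, 33]
operator_b = [43, 44, 56, 63, 46]

u = range_test_ratio(operator_a, operator_b)
t, dof = pooled_t(operator_a, operator_b)
print(f"u = {u:.3f}, t = {t:.3f} on {dof} degrees of freedom")
```

The value of u is then compared with the entry of Table 1 for the two sample sizes, and t with the ordinary t table.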

Example (ii). Twelve children aged 12 years to 12 years 3 months were selected at random from a
school and fed on a special diet of pasteurized milk for 4 months. The gains in weight, in ounces, of the
twelve children over the 4-month period were

    7, 17, 53, −2, 27, 41, 37, 35, 10, 12, 9, 38.

Eight children of similar age were selected as ‘controls’ and not given the diet. The gains in weight were

    10, 0, 29, 11, −21, 25, 19, −19.

From these data it is desired to investigate whether the pasteurized milk diet results in a greater increase
in weight over the 4-month period. The figures give

    $\bar{x}_1 = 23.667$,  $w_1 = 55$,  $\sum_i (x_{1i} - \bar{x}_1)^2 = 3202.7$;

    $\bar{x}_2 = 6.750$,  $w_2 = 50$,  $\sum_i (x_{2i} - \bar{x}_2)^2 = 2485.5$.

Hence

    $u = \dfrac{23.667 - 6.750}{55 + 50} = 0.161$.

Using a one-tailed test, and entering Table 1 with $n_1 = 8$, $n_2 = 12$, it is seen that u falls just beyond the
2½ % significance point.

If the t-test is used, we have t = 2.085; with $\nu = 18$ this is a value not quite reaching the 2½ % level of
significance. Both tests therefore show that there is an increase in mean weight associated with the
pasteurized milk diet and that approximately this is significant at the 2½ % level.


3. THE RANGE ESTIMATOR OF STANDARD DEVIATION

When the samples are of unequal size, the unweighted sum

    $\theta = w_1 + w_2$    (2)

does not provide the best estimate of the assumed common population standard deviation, $\sigma$. It may
be shown (see David, 1951) that

    $\phi = w_1 + f(n_1, n_2)\,w_2$    (3)

will provide the range estimate with minimum coefficient of variation, where

    $f(n_1, n_2) = \dfrac{d_{n_2}}{d_{n_1}} \times \dfrac{V_{n_1}}{V_{n_2}}$,    (4)

$d_n$ and $V_n$ being the mean and variance of range in a random sample of size n from a normal population
having unit standard deviation. Values of $d_n$, $V_n$ and their ratio are given by Pearson & Hartley (1954,
Table 20). The unbiased estimate of $\sigma$ is then

    $g = \dfrac{w_1 + f(n_1, n_2)\,w_2}{d_{n_1} + f(n_1, n_2)\,d_{n_2}}$.    (5)

As we shall proceed to show, the gain from using $\phi$ rather than $\theta$ does not appear worthwhile in a test of
significance. As there may, however, be situations where the more accurate range estimate of $\sigma$ is
preferred, we give in Table 2 at the end of this paper values of $f(n_1, n_2)$ from equation (4) and of the
denominator $d_{n_1} + f(n_1, n_2)\,d_{n_2}$ of estimator g of equation (5). The values of $n_1$, $n_2$ go from 2 to 20,
$n_2$ corresponding to the larger of the two samples.
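
The weight of equation (4) can be sketched numerically in Python. The paper takes $d_n$ and $V_n$ from published tables; the code below instead assumes that they may be obtained by Simpson integration of the distribution of the range of n unit-normal observations, and all function names are illustrative.

```python
import math


def pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def simpson(func, lower, upper, steps):
    """Composite Simpson rule; steps must be even."""
    h = (upper - lower) / steps
    total = func(lower) + func(upper)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * func(lower + i * h)
    return total * h / 3.0


def range_cdf(w, n):
    """P(range <= w) = n * integral of pdf(x) * (cdf(x + w) - cdf(x))**(n - 1)."""
    return n * simpson(lambda x: pdf(x) * (cdf(x + w) - cdf(x)) ** (n - 1),
                       -8.0, 8.0, 200)


def range_mean_var(n, upper=12.0, steps=240):
    """Mean d_n and variance V_n of the range, from E(W) = int P(W > w) dw
    and E(W^2) = int 2 w P(W > w) dw, both by Simpson's rule."""
    h = upper / steps
    first = second = 0.0
    for i in range(steps + 1):
        w = i * h
        weight = 1.0 if i in (0, steps) else (4.0 if i % 2 else 2.0)
        tail = 1.0 - range_cdf(w, n)
        first += weight * tail
        second += weight * 2.0 * w * tail
    d_n = first * h / 3.0
    return d_n, second * h / 3.0 - d_n ** 2


def weight_and_denominator(n1, n2):
    """f(n1, n2) of equation (4) and the denominator of the estimator g in (5)."""
    d1, v1 = range_mean_var(n1)
    d2, v2 = range_mean_var(n2)
    f = (d2 / d1) * (v1 / v2)
    return f, d1 + f * d2


f, denominator = weight_and_denominator(10, 11)
print(round(f, 3), round(denominator, 3))
```

For $n_1 = 10$, $n_2 = 11$ the result can be compared with the corresponding entries of Table 2; given two observed ranges, the estimate g of equation (5) is then $(w_1 + f w_2)$ divided by the denominator.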
4. JUSTIFICATION FOR USING THE UNWEIGHTED SUM OF RANGES


We start by comparing in Table 3 the coefficients of variation of $\theta$ and $\phi$ for certain combinations of
$n_1$ and $n_2$. It will be seen that only when the samples are of very unequal size is there likely to be any
marked loss of efficiency. A more direct method of studying the loss of efficiency resulting from the use
of $\theta$ instead of $\phi$ in a test of significance is to compare the power functions of the corresponding tests.


Table 3. Ratio of coefficients of variation of $\theta$ and $\phi$

            |              $n_1$
   $n_2$    |    3       5       7       9
      3     |  1.000     -       -       -
      6     |  1.030   1.002     -       -
      9     |  1.066   1.018   1.003   1.000
     12     |  1.097   1.036   1.013   1.003
     16     |  1.131   1.060   1.029   1.013
     20     |  1.158   1.080   1.043   1.024

Table 4. Power functions of t and modified t-tests

 Sample       | Type of       |        Values of $(\mu_1 - \mu_2)/\sigma$
 sizes        | test          |    1       2       3       4       5
 $n_1 = 6$    | R.M.S.        |  0.244   0.602   0.886   0.984   0.999
 $n_2 = 10$   | $w_1 + f w_2$ |  0.242   0.596   0.882   0.983   0.999
              | $w_1 + w_2$   |  0.241   0.593   0.879   0.982   0.999
 $n_1 = 3$    | R.M.S.        |  0.249   0.614   0.895   0.987   0.999
 $n_2 = 20$   | $w_1 + f w_2$ |  0.245   0.605   0.889   0.985   0.999
              | $w_1 + w_2$   |  0.241   0.593   0.879   0.982   0.999
 $n_1 = 2$    | R.M.S.        |  0.210   0.504   0.791   0.944   0.991
 $n_2 = 4$    | $w_1 + f w_2$ |  0.207   0.492   0.775   0.935   0.989
              | $w_1 + w_2$   |  0.204   0.482   0.767   0.929   0.986





As will be suggested below a good approximation to the distribution of $\theta$, and presumably also of $\phi$,
may be obtained from the $\chi$-distribution. From this it follows that the distribution of the test ratio u
of equation (1) and that of a similar ratio using $\phi$ instead of $\theta$ will follow t-distributions if the population
means $\mu_1$ and $\mu_2$ are equal and non-central t-distributions if $\mu_1 \ne \mu_2$. The ‘equivalent’ degrees of freedom
using $\phi$ will be less than the standard value $\nu = n_1 + n_2 - 2$ of Student’s test, and the degrees of freedom
using $\theta$ will be somewhat less than those using $\phi$. What this means in terms of power may be found using
the tables of the power function of the t-test given by Neyman (1935) for the one-tailed test. Results of
this comparison for a 5% significance level test are shown in Table 4, using the appropriate
‘equivalent’ degrees of freedom for the $\theta$ and $\phi$ estimates in the manner outlined in the following
section. It is seen that while both forms of test based on range are less powerful than the standard test
using the root-mean-square estimate of $\sigma$, the difference between the $\theta$ and $\phi$ forms is not large.

We conclude that if a range test is required, then the simpler form involving the straight sum
$\theta = w_1 + w_2$ may be used since the amount of power sacrificed is negligible.
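
The kind of comparison made in Table 3 can be sketched as follows. This Python fragment computes the coefficient of variation of the unweighted and of the weighted sum of two ranges and takes their ratio. The constants for n = 3 and n = 6 are commonly tabulated three-decimal values of the mean and standard deviation of the range, assumed here for illustration and not copied from the paper.

```python
import math


def coefficient_of_variation(weights, means, variances):
    """Coefficient of variation of a weighted sum of independent ranges."""
    mean = sum(w * d for w, d in zip(weights, means))
    var = sum(w * w * v for w, v in zip(weights, variances))
    return math.sqrt(var) / mean


# Assumed constants: mean range d_n and variance of range V_n (unit normal parent).
d = {3: 1.693, 6: 2.534}
V = {3: 0.888 ** 2, 6: 0.848 ** 2}

n1, n2 = 3, 6
f = (d[n2] / d[n1]) * (V[n1] / V[n2])        # weight from equation (4)
means = [d[n1], d[n2]]
variances = [V[n1], V[n2]]

cv_unweighted = coefficient_of_variation([1.0, 1.0], means, variances)
cv_weighted = coefficient_of_variation([1.0, f], means, variances)
ratio = cv_unweighted / cv_weighted
print(round(ratio, 3))
```

A ratio close to 1 indicates that little precision is lost by using the plain sum of ranges, which is the point of the section.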
5. CALCULATION OF SIGNIFICANT POINTS

Before making extensive use of Patnaik’s $\chi$-approximation it was decided to test its accuracy in one case,
with $n_1 = 4$, $n_2 = 16$, by also making a full evaluation by quadrature.
(i) Quadrature 


It is not necessary to describe the method used in detail. Use was made of the values of the ordinates 
of the distributions of range required, kindly supplied by Mr Lord, who had previously derived them 
from the basic 5-decimal-place tables of the probability integral of range used by Pearson & Hartley 
(1942). As a result, we could find the exact significance points of a normal variable divided by an estimate 
of its standard error based on the sum of two independent ranges. 


(ii) The $\chi$-approximation


Patnaik’s method leads us to identify the first two moments of $(w_1 + w_2)/\sigma$, where $\sigma$ is the population
standard deviation, with the first two moments of $c\chi/\sqrt{\nu}$ where $\chi$ has $\nu$ degrees of freedom. The first two
moments of $(w_1 + w_2)/\sigma$ are

    $M = d_{n_1} + d_{n_2}$,  $V = V_{n_1} + V_{n_2}$,    (6)

and these are equated to those of $c\chi/\sqrt{\nu}$ to give

    $M = c\,\sqrt{2/\nu}\;\Gamma\!\left(\tfrac{\nu+1}{2}\right)\Big/\Gamma\!\left(\tfrac{\nu}{2}\right)$,    (7)

    $V = c^2\left[1 - \dfrac{2}{\nu}\,\Gamma^2\!\left(\tfrac{\nu+1}{2}\right)\Big/\Gamma^2\!\left(\tfrac{\nu}{2}\right)\right]$.    (8)

Expanding the gamma functions by Stirling’s formula we obtain

    $k = \dfrac{V}{M^2} = \dfrac{1}{2\nu} + \dfrac{1}{8\nu^2} - \dfrac{1}{16\nu^3} + \dots$    (9)

As a first approximation

    $\nu^{-1} = -2 + 2\sqrt{1 + 2k}$,    (10)

and a better approximation leads to

    $\nu^{-1} = -2 + 2\sqrt{1 + 2j}$,    (11)

where $j = k + 1/(16\nu^3)$, the value of $\nu$ from (10) being used for j. The expression for c to the same order
of accuracy is

    $c = M\left[1 + \dfrac{1}{4\nu} + \dfrac{1}{32\nu^2} - \dfrac{5}{128\nu^3}\right]$.    (12)

Hence, if x is a unit normal variable the distribution of $x(d_{n_1} + d_{n_2})/(w_1 + w_2)$ can be approximated to
by $Mt/c$, where the t-distribution has the ‘equivalent’ degrees of freedom from (11) above. The
percentage points can be found by interpolating for fractional $\nu$ in a table of the percentage points of the
t-distribution and multiplying by the appropriate factor.
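
The two-stage solution for the equivalent degrees of freedom can be traced in a few lines of Python. This is a sketch of equations (9) to (12) as reconstructed here, with an illustrative function name; the inputs are the values M = 8.54160 and V = 2.01639 that the paper quotes in § 6 for three ranges from samples of eight.

```python
import math


def chi_approximation(M, V):
    """Equivalent degrees of freedom nu and scale c for a sum of ranges
    whose mean is M and variance V (in units of sigma)."""
    k = V / M ** 2                                      # left side of (9)
    nu_first = 1.0 / (-2.0 + 2.0 * math.sqrt(1.0 + 2.0 * k))    # (10)
    j = k + 1.0 / (16.0 * nu_first ** 3)
    nu = 1.0 / (-2.0 + 2.0 * math.sqrt(1.0 + 2.0 * j))          # (11)
    c = M * (1.0 + 1.0 / (4.0 * nu) + 1.0 / (32.0 * nu ** 2)
             - 5.0 / (128.0 * nu ** 3))                          # (12)
    return nu, c


nu, c = chi_approximation(8.54160, 2.01639)
print(round(nu, 2), round(c, 4))
```

Equation (10) is simply the positive root of the quadratic in $1/\nu$ obtained by dropping the cubic term of (9), and (11) repeats the step with the cubic term moved to the left-hand side.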
For the case where $n_1 = 4$ and $n_2 = 16$ the percentage points calculated by the two methods are:

    $\alpha$ (two-tailed) | Quadrature | $\chi$-approximation
    0.10                  |   1.744    |   1.746
    0.05                  |   2.130    |   2.136
    0.02                  |   2.615    |   2.628
    0.01                  |   2.971    |   2.995





A comparison of these results with those given by Patnaik (1950, p. 81) for equal sample sizes shows that
the $\chi$-approximation is as good for the unequal sample sizes considered here as it was for the equal sample
sizes. With this check it was felt safe to proceed with the calculation of the percentage points using the
$\chi$-approximation.
6. CALCULATION OF TABLE


For each of the 190 pairs of sample sizes, where $n_1$ and $n_2$ go from 2 to 20, the following quantities were
calculated:

(i) M and V from $d_n$ and $V_n$ as in (6),
(ii) $1/\nu$ from (10) and (11),
(iii) $c' = c/M$ from (12).

For each value of $\nu$ found from (ii) the percentage points of t for $\alpha$ = 0.10, 0.05, 0.02 and 0.01 were
obtained by interpolation in the t tables. Call these values $t_{\nu,\alpha}$. The required percentage point will then be

    $u_\alpha = \dfrac{t_{\nu,\alpha}\,\sqrt{1/n_1 + 1/n_2}}{c'\,(d_{n_1} + d_{n_2})}$.    (13)

A subsidiary table of $\sqrt{n_1^{-1} + n_2^{-1}}$ was computed and $u_\alpha$ finally obtained from (13). The values are given
in Table 1 to three places of decimals and it is thought that even in the worst cases, that is when both
samples are small, or the samples are very unequal in size, the error should not exceed 3 in the third
decimal. As a check the values were differenced for $n_1$ given $n_2$ and for $n_2$ given $n_1$, and in the particular
case where $n_1$ equals $n_2$ the values can be checked against the values given by Lord (1947, Table 10,
p. 66). They should be just half Lord’s values since he used $\tfrac{1}{2}(w_1 + w_2)$ for his denominator in u.
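
The conversion in equation (13) can be illustrated with a short Python sketch. The function name is hypothetical. For $n_1 = 4$, $n_2 = 16$ the $\chi$-approximation column of the comparison in § 5 already lists $t_{\nu,\alpha}/c'$, so $c'$ is set to 1 below; the mean ranges $d_4 = 2.059$ and $d_{16} = 3.532$ are commonly tabulated values assumed for illustration, not figures quoted in the paper.

```python
import math


def percentage_point_u(t_value, c_prime, n1, n2, d_n1, d_n2):
    """Equation (13): percentage point of u from a percentage point of t."""
    return t_value * math.sqrt(1.0 / n1 + 1.0 / n2) / (c_prime * (d_n1 + d_n2))


# t / c' for alpha = 0.10 and 0.01 (two-tailed), from the chi-approximation column.
for alpha, t_over_c in [(0.10, 1.746), (0.01, 2.995)]:
    u_alpha = percentage_point_u(t_over_c, 1.0, 4, 16, 2.059, 3.532)
    print(alpha, round(u_alpha, 3))
```

The printed values can be set beside the entries of Table 1 for $n_1 = 4$, $n_2 = 16$.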

We have not considered in detail modifications of this test which could be obtained, for example,
by splitting up the sample ranges into the ranges of smaller subsamples. If $n_1 = 8$ and $n_2 = 16$ it might
be better to estimate the variability by splitting the second sample up randomly into two smaller
subsamples, each of eight, giving us three ranges $w_1$, $w_2$ and $w_3$. The expected gain can be gauged by finding
the power function for a 5% significance level test as in Table 4. For the three ranges $w_1 + w_2 + w_3$,

    $M = 3d_8 = 8.54160$,  $V = 3V_8 = 2.01639$,

and using equations (10) and (11) $\nu$ is found to be 18.33. If the two ranges are used in the form $w_1 + w_2$
as in this paper $\nu$ = 16.72, whilst if the root-mean-square estimate of standard deviation is used $\nu$ = 22.
The powers from Neyman’s tables are

















    $\rho/\sigma$        |    1       2       3       4       5
    R.M.S.               |  0.250   0.615   0.896   0.987   0.999
    $w_1 + w_2 + w_3$    |  0.248   0.611   0.893   0.986   0.999
    $w_1 + w_2$          |  0.247   0.607   0.891   0.986   0.999





The extra gain by splitting into subsamples thus seems to be very small and would not appear to be 
worth the very much greater complexity which such a modification would introduce into the test. 
Table 1. Values of $u = |\bar{x}_1 - \bar{x}_2|/(w_1 + w_2)$ exceeded with probability $\alpha$
(for a single-tailed test $\alpha$ must be halved)

                 |        Probability ($\alpha$)
 $n_1$   $n_2$   |  0.10    0.05    0.02    0.01
    2       2    | 1.161   1.714   2.776   3.958
    2       3    | 0.693   0.915   1.255   1.557
    2       4    | 0.556   0.732   1.002   1.242
    2       5    | 0.478   0.619   0.827   1.008
    2       6    | 0.429   0.549   0.721   0.865
    2       7    | 0.396   0.502   0.652   0.776
    2       8    | 0.372   0.469   0.603   0.713
    2       9    | 0.353   0.443   0.567   0.666
    2      10    | 0.338   0.423   0.538   0.630
    2      11    | 0.326   0.407   0.515   0.601
    2      12    | 0.316   0.393   0.496   0.557
    2      13    | 0.307   0.382   0.480   0.557
    2      14    | 0.300   0.372   0.467   0.541
    2      15    | 0.294   0.363   0.455   0.526
    2      16    | 0.287   0.356   0.445   0.513
    2      17    | 0.282   0.349   0.436   0.502
    2      18    | 0.278   0.343   0.428   0.492
    2      19    | 0.274   0.338   0.420   0.483
    2      20    | 0.270   0.333   0.414   0.475

    3       3    | 0.487   0.635   0.860   1.050
    3       4    | 0.398   0.511   0.663   0.814
    3       5    | 0.339   0.429   0.556   0.660
    3       6    | 0.311   0.391   0.501   0.590
    3       7    | 0.288   0.360   0.458   0.536
    3       8    | 0.271   0.338   0.427   0.498
    3       9    | 0.258   0.321   0.404   0.469
    3      10    | 0.248   0.307   0.385   0.446
    3      11    | 0.240   0.296   0.370   0.427
    3      12    | 0.232   0.287   0.358   0.412
    3      13    | 0.226   0.279   0.347   0.399
    3      14    | 0.221   0.272   0.338   0.388
    3      15    | 0.216   0.266   0.330   0.378
    3      16    | 0.212   0.261   0.323   0.370
    3      17    | 0.209   0.256   0.317   0.362
    3      18    | 0.205   0.252   0.311   0.356
    3      19    | 0.202   0.248   0.306   0.350
    3      20    | 0.200   0.245   0.302   0.344

    4       4    | 0.322   0.407   0.526   0.620
    4       5    | 0.282   0.353   0.450   0.528
    4       6    | 0.256   0.319   0.403   0.469
    4       7    | 0.237   0.294   0.370   0.429
    4       8    | 0.224   0.276   0.346   0.399
    4       9    | 0.213   0.263   0.327   0.377
    4      10    | 0.204   0.252   0.313   0.359
    4      11    | 0.197   0.242   0.301   0.345
    4      12    | 0.191   0.235   0.291   0.333
    4      13    | 0.186   0.228   0.282   0.322
    4      14    | 0.182   0.223   0.275   0.314
    4      15    | 0.178   0.218   0.268   0.306
    4      16    | 0.175   0.213   0.263   0.299
    4      17    | 0.172   0.210   0.258   0.293
    4      18    | 0.169   0.206   0.253   0.288
    4      19    | 0.166   0.203   0.249   0.283
    4      20    | 0.164   0.200   0.246   0.279

    5       5    | 0.247   0.307   0.387   0.450
    5       6    | 0.224   0.277   0.347   0.402
    5       7    | 0.208   0.256   0.319   0.368
    5       8    | 0.195   0.240   0.299   0.343
    5       9    | 0.186   0.228   0.282   0.323
    5      10    | 0.178   0.218   0.270   0.309
    5      11    | 0.172   0.210   0.260   0.296
    5      12    | 0.167   0.204   0.251   0.286
    5      13    | 0.162   0.198   0.244   0.277
    5      14    | 0.158   0.193   0.237   0.270
    5      15    | 0.155   0.189   0.232   0.263
    5      16    | 0.152   0.185   0.227   0.257
    5      17    | 0.149   0.182   0.222   0.252
    5      18    | 0.147   0.179   0.218   0.248
    5      19    | 0.144   0.176   0.215   0.244
    5      20    | 0.142   0.173   0.212   0.240

    6       6    | 0.203   0.250   0.312   0.359
    6       7    | 0.188   0.240   0.287   0.329
    6       8    | 0.177   0.217   0.268   0.307
    6       9    | 0.168   0.206   0.254   0.289
    6      10    | 0.161   0.197   0.242   0.276
    6      11    | 0.155   0.189   0.233   0.265
    6      12    | 0.150   0.183   0.225   0.255
    6      13    | 0.146   0.178   0.218   0.247
    6      14    | 0.142   0.173   0.212   0.241
    6      15    | 0.139   0.169   0.207   0.235
    6      16    | 0.136   0.166   0.203   0.229
    6      17    | 0.134   0.163   0.199   0.225
    6      18    | 0.131   0.160   0.195   0.221
    6      19    | 0.129   0.157   0.192   0.217
    6      20    | 0.128   0.155   0.189   0.214

    7       7    | 0.174   0.213   0.263   0.301
    7       8    | 0.163   0.200   0.246   0.281
    7       9    | 0.155   0.189   0.233   0.265
    7      10    | 0.148   0.181   0.222   0.252
    7      11    | 0.148   0.174   0.213   0.242
    7      12    | 0.138   0.168   0.206   0.233
    7      13    | 0.134   0.163   0.199   0.226
    7      14    | 0.131   0.159   0.194   0.220
    7      15    | 0.128   0.155   0.189   0.214
    7      16    | 0.125   0.152   0.185   0.209
    7      17    | 0.123   0.149   0.181   0.205
    7      18    | 0.121   0.146   0.178   0.201
    7      19    | 0.119   0.144   0.175   0.198
    7      20    | 0.117   0.142   0.172   0.195
Table 1 (continued)

                 |        Probability ($\alpha$)
 $n_1$   $n_2$   |  0.10    0.05    0.02    0.01
    8       8    | 0.153   0.187   0.231   0.262
    8       9    | 0.145   0.177   0.217   0.247
    8      10    | 0.139   0.169   0.207   0.235
    8      11    | 0.133   0.162   0.199   0.225
    8      12    | 0.129   0.157   0.192   0.217
    8      13    | 0.125   0.152   0.186   0.210
    8      14    | 0.122   0.148   0.180   0.204
    8      15    | 0.119   0.144   0.176   0.199
    8      16    | 0.116   0.141   0.172   0.194
    8      17    | 0.114   0.138   0.168   0.190
    8      18    | 0.112   0.136   0.165   0.186
    8      19    | 0.110   0.134   0.162   0.183
    8      20    | 0.109   0.132   0.160   0.180

    9       9    | 0.137   0.167   0.205   0.233
    9      10    | 0.131   0.160   0.195   0.221
    9      11    | 0.126   0.153   0.187   0.212
    9      12    | 0.122   0.148   0.180   0.204
    9      13    | 0.118   0.143   0.175   0.197
    9      14    | 0.115   0.139   0.170   0.192
    9      15    | 0.112   0.136   0.165   0.187
    9      16    | 0.110   0.133   0.162   0.182
    9      17    | 0.107   0.130   0.158   0.178
    9      18    | 0.106   0.128   0.155   0.175
    9      19    | 0.104   0.126   0.152   0.172
    9      20    | [illegible]   0.124   0.150   [illegible]

   10      10    | 0.125   0.152   0.186   0.210
   10      11    | 0.120   0.146   0.178   0.201
   10      12    | 0.116   0.141   0.171   0.194
   10      13    | 0.112   0.136   0.166   0.187
   10      14    | 0.109   0.133   0.161   0.182
   10      15    | 0.107   0.129   0.157   0.177
   10      16    | 0.104   0.126   0.153   0.173
   10      17    | 0.102   0.124   0.150   0.169
   10      18    | 0.100   0.121   0.147   0.165
   10      19    | 0.098   0.119   0.144   0.162
   10      20    | 0.097   0.117   0.142   0.160

   11      11    | 0.115   0.140   0.170   0.193
   11      12    | 0.111   0.135   0.164   0.185
   11      13    | 0.108   0.131   0.159   0.179
   11      14    | 0.105   0.127   0.154   0.174
   11      15    | 0.102   0.123   0.150   0.169
   11      16    | 0.100   0.121   0.146   0.165
   11      17    | 0.098   0.118   0.143   0.161
   11      18    | 0.096   0.116   0.140   0.158
   11      19    | 0.094   0.114   0.138   0.155
   11      20    | 0.092   0.112   0.135   0.152

   12      12    | 0.107   [illegible]   [illegible]   [illegible]
   12      13    | 0.104   [illegible]   [illegible]   [illegible]
   12      14    | 0.101   0.122   [illegible]   0.167
   12      15    | 0.098   0.119   0.144   0.162
   12      16    | 0.096   0.116   0.140   0.158
   12      17    | 0.094   0.113   0.137   0.154
   12      18    | 0.092   0.111   0.134   0.151
   12      19    | 0.090   0.109   0.132   0.149
   12      20    | 0.089   0.107   0.130   0.146

   13      13    | 0.100   0.121   0.147   0.166
   13      14    | 0.097   0.118   0.143   0.161
   13      15    | 0.095   0.115   0.139   0.156
   13      16    | 0.092   0.112   0.135   0.152
   13      17    | 0.090   0.109   0.132   0.149
   13      18    | 0.089   0.107   0.130   0.146
   13      19    | 0.087   0.105   0.127   0.143
   13      20    | 0.086   0.103   0.125   0.140

   14      14    | 0.094   0.114   0.138   0.156
   14      15    | 0.092   0.111   0.135   0.151
   14      16    | 0.090   0.108   0.131   0.147
   14      17    | 0.088   0.106   0.128   0.144
   14      18    | 0.086   0.104   0.125   0.141
   14      19    | 0.084   0.102   0.123   0.138
   14      20    | 0.083   0.101   0.121   0.135

   15      15    | 0.089   0.108   0.131   0.147
   15      16    | 0.087   0.105   0.127   [illegible]
   15      17    | 0.085   0.103   0.124   0.140
   15      18    | 0.083   0.101   0.122   0.137
   15      19    | 0.082   0.099   0.119   0.134
   15      20    | 0.080   0.097   0.117   0.131

   16      16    | 0.085   [illegible]   [illegible]   [illegible]
   16      17    | 0.083   0.100   [illegible]   [illegible]
   16      18    | 0.081   0.098   0.118   0.133
   16      19    | 0.080   0.096   0.116   0.130
   16      20    | 0.078   0.094   0.114   0.128

   17      17    | 0.081   0.098   0.118   [illegible]
   17      18    | 0.079   0.096   0.115   [illegible]
   17      19    | 0.078   0.094   0.113   0.127
   17      20    | 0.076   0.092   0.111   0.124

   18      18    | 0.077   0.093   0.113   0.126
   18      19    | 0.076   0.092   0.110   0.124
   18      20    | 0.074   0.090   0.108   0.121

   19      19    | 0.074   0.090   0.108   0.121
   19      20    | 0.073   0.088   0.106   0.119

   20      20    | 0.071   0.086   0.104   0.116
Table 2. To assist in estimating a standard deviation from the range estimator
$g = (w_1 + f(n_1, n_2)\,w_2)/(d_{n_1} + f(n_1, n_2)\,d_{n_2})$
























































| | | 
f d,,+fdn, | ™% Ne fs | dy +fdn, | ™ | Ne | Sf | dy. +fdn, 
i | | | 
| | 
1-378 3-461 5 13 1-804 8-344 10 | 11 | 1-057 | 6-431 
1-713 4-655 14 1-878 8-724 | 12 | 1-110 | 6-694 
2-009 5-801 15 1949 | 9-093 13 | 1-160 6-947 
| 14 ‘1-208 7-192 
2-267 6-873 16 2-016 | 9-446 15 | 1-253 7-427 
2-512 7-920 17 2-080 9-789 | 
2-731 8-903 18 2-142 | 10-123 | 16 | 1-297 | 7-658 
2-931 9-833 19 | 2-201 | 10-445 | 17 | 1-338 | 7-878 
3-117 10-721 20 2-258 10-760 18 | 1-377 8-089 
| 19 1-416 8-301 
3-298 11-592 .; F 1-105 5-522 | 20 | 1-452 8-500 
3-620 | 18-204 Bak da~ - coat 
: . 9 1-291 6-368 1l | 12 1-050 6-594 
3-768 | 13-965 | 10 | 1375 | 6-766 | 13 | 1-098 6-836 
3-910 | 14-703 | " | ont | ie | 14 | 1143 | 7-067 
| 1 . 15 1:186 | 7-291 
4005 | 641s | 12 | 1-526 7-506 | | 
, ‘ 1 ee 1-595 7-855 i6 | 1:227 | 7-507 
4-296 16-766 | 44 1-661 8-193 | 17 | 1-266 | 7-715 
4418 | 17-426 15 | 1-723 8-516 | 18 | 1-303 | 7-916 
"52 ‘ | 19 | 1-340 8-116 
| 16 1-782 8-828 | 20 | 1-374 | 8-805 
fa pad 17 1-839 9-132 | | 
. ° 18 1-893 9-425 12 | 13 | 1-045 6-744 
ie 19 1-946 9-713 | 14 | 1088 | 6-965 
— _—< 20 , 1-996 9-989 | 15 | 1129 | 7-178 
4 : | 
ae i a | 16 | 1168 | 7-383 
ae 8-639 9 | 1-168 6-173 17 1-205 | 7-581 
10 | 1-244 6-532 | 18 | 1-241 | 7-775 
| 19 1-275 | 7-961 
2387 907 | 1314 6-873 | 20 | 1:308 | 8-143 
12 1-380 7-201 
2-620 10-433 | 
2-728 10-987 7 | saan apn 13 | 14 | 1-041 6-882 
. j | 1: , 15 1-080 7-086 
| 2830 | 11-518 15 | 1-558 | 8113 
| 
2.998 | 12-085 | | | 16 | L118 | 7-285 
‘ : 16 1-612 8-398 17 1-153 7-473 
ca | aeaee 17 | 1-664 8-674 ig | 1-187 | 7-657 
3-199 13-494 18 L713 | 8-939 19 1-220 7-837 
19 1-760 | 9-197 | 20 1-251 8-008 
— | 20 | 1-806 | 9-449 | | 
1-171 4-783 i ae ee 7° 
. ‘ 1074 | 6037 14 15 | 1-037 | 7-007 
= : Zia 16 1-073 7-197 
roe e617 1-144 17. | 1-108 7-382 
1-4 -018 | 18 1:140 | 7-557 
| 1-593 | 6-595 Ean: 19 | 1172 | 7-730 
| 711 7-141 12 | 1-269 6-982 
| 20 | 1-202 | 7-896 
| 1-821 7-663 13 | 1-327 | 7-274 
| | he ee ree lt is | 16 | 1-085 7-128 
! | gs. | 15 1-433 | 7-822 . 
| intl | | 17 | 1-067 | 7-300 
2-113 9-108 | 16 | 1-483 | $088 ht Be = 
2-200 9-554 | 17 | 1-530 8-336 20 ho : i 
| 2-283 9-985 | 18 | 1-575 8-580 4 } 
| 19 | 1-619 | 8-819 
2-362 10-402 | 20 | 1-661 | 9-051 16 17 1-032 7-235 
| 2-437 10-803 | 18 | 1-062 7-398 
| 2-509 11-192 9 10 1064 | 6-244 | 19 | 1-092 | 7-560 
2-578 11-569 1l 1-125 | 6-539 | 20 | 1-120 7-715 
| 2-645 11-938 12 1-182 6-821 | 
| 13 1-235 | 7-090 17 18 | 1-030 | 7.337 
| 1-131 5-192 14 1-286 7-351 19 | 1-058 | 7-491 
| 1-250 5-706 15 1-334 | 7-601 20 | 1-085 7-640 
1-360 6-198 | | 
| 1-461 6-665 16 1-380 7-844 | 
| 1-568 7-152 17 | 14o4 | 8.079 = 1 | ‘oe Lp = 
18 1-466 | 830 | | 
| 1-643 7-539 19 1-507 8-529 | 
| 1-726 7-950 20 1-546 8-744 19 20 | 1-026 | 7-521 
| 
A BIBLIOGRAPHY ON THE THEORY OF QUEUES


By ALISON DOIG 
Research Techniques Unit, London School of Economics 


In this bibliography an attempt has been made to assemble all papers on those aspects of 
the theory of probability which may be grouped together under the heading of the study 
of queues. In addition to purely theoretical work, the practical applications of the subject 
have received much attention, the most important and fruitful of these being in the field
of telephone traffic. The other main applications have been to the study of road traffic, 
the allocation of operatives to the servicing of machines, the mathematical aspects of 
inventory control and production scheduling, storage problems such as the optimal size 
of dams and miscellaneous topics which include the scheduling of air traffic and the design 
of appointment systems in hospital outpatient departments. The study of point processes 
and, in particular, the problems arising from counts obtained from Geiger-Müller counters,
resemble closely problems arising in telephone systems where no waiting is possible (loss- 
systems) and a selection of papers on such processes has been included. Similarly, 
problems arising in the theory of dams and provisioning are closely related to the problem 
of collective risk in actuarial studies. In view of the large number of papers on the
latter topic, some of which are specifically actuarial in character, the choice in this case has 
been limited to a few papers only, these being chosen both for their own intrinsic importance 
and for the fact that they themselves provide references to other work. A similar policy 
has been followed for papers concerned strictly with the economics of inventory control 
and also for papers on renewal theory. 

Classification. The papers are listed in alphabetical order by authors and have been 
classified as belonging to one or more of ten main groups. These are: 


Problems dealing with storage (content). 

Problems relating to flow through a network. 

Applications not covered by the other categories. 
Inventory problems. 

Problems arising in servicing automatic machines. 

Point processes and counter problems. 

The general theory of queues. 

Road traffic and related topics. 

Stochastic processes directly related to the study of queues. 
Problems in telephone traffic. 


HaAWOoW EOF OQ 


Within these groups, a further subdivision has been made into theory (t), numerical 
results (including tables and graphs) and practical applications (a), and expository and 
descriptive articles (d). Where it is relevant, a distinction is made between loss-systems (I) 
in which the customer leaves immediately if service is not available at the instant at which 
he first requires it and delay-systems (w) in which waiting is possible. The description of a 
queueing system proposed by D. G. Kendall (Ann. Math. Statist. (1953), 24, 338-54) has 
been given where possible; in some cases the description has been omitted if the paper in 
question deals equally with many different systems or if the necessary information for 
such a description is lacking. For simplicity, variants of the basic queue discipline
(service in order of arrival) are grouped together under the letter v; such variants include 
queueing with priorities, random choice of the next customer to be served from amongst 
those awaiting service and queues in series in which the customer receives attention at 
more than one counter before he leaves the system. 

Abstracts and short notes are indicated by the letter A and papers with useful lists of 
references by the letter B. Finally, the papers judged to be the most important either by 
reason of their contents or for their survey of a branch of the subject have been marked 
with an asterisk. 


This bibliography was prepared in the Research Techniques Unit of the London School 
of Economics under the direction of Prof. M. G. Kendall and Dr F. G. Foster. The work was 
supported from a grant made to the Unit by the Department of Scientific and Industrial 
Research. The author wishes to thank all those who were of assistance to her in locating 
papers and in particular Dip. Ing. R. Syski for letting her see a pre-publication copy of the 
bibliography prepared by him and Dr J. W. Cohen, and the Royal Statistical Society for 
permission to cite material to appear in subsequent issues of the Journal of the Society, 
Series B. 
MISCELLANEA


Studies in the history of probability and statistics 
VI. A note on the early solutions of the problem of the duration of play 


By A. R. THATCHER 


It is now just 300 years since the publication by Huygens of the first result on the famous problem which 
became known as the Duration of Play. The aim of this note is to summarize the early development of 
this problem and to show how easily some of the solutions found at the beginning of the eighteenth 
century can be linked with modern work on sequential tests, random walks and certain storage problems. 

We use throughout the following notation. Call the two players A and B, and let their chances of
winning a game be p and q = 1 − p, respectively. A starts with a counters and B starts with b counters, and
after each game the loser hands one counter to the winner. It is desired to find first the probability $P_a$
that A will eventually lose all his counters without having previously won all B’s, and more generally the
probability $P_{a,n}$ that this will happen within n games. $P_b$ and $P_{b,n}$ are defined similarly. $P_{a,n} + P_{b,n}$ is
the probability that the play will terminate (with the ‘ruin’ of one of the players) within n games. It can
be shown that the play must end sooner or later, so that $P_a + P_b = 1$.

In 1657 Huygens gave without proof, in the fifth and last problem of his treatise De ratiociniis in ludo
aleae, the numerical value for $P_b$ in a case where $a = b = 12$ and where $p$ and $q$ had particular values.
The general result for $P_b$ was found by James Bernoulli, who died in 1705, but it remained in manuscript
until it was published 8 years later in his Ars Conjectandi; Bernoulli says that the proof is laborious and 
leaves it to the reader. Before the Ars Conjectandi appeared, however, de Moivre had found a simple 
derivation independently and published it in his treatise De Mensura Sortis (1711). 

De Moivre’s original proof, which was later reproduced in his Doctrine of Chances (see 1711, pp. 227-8; 
1718, pp. 23-4; 1738, pp. 45-6; 1756, pp. 52-3), is very ingenious and so much shorter than the demon- 
strations usually given in modern textbooks that it is worth quoting. Its essence is as follows. Imagine 
that each player starts with his counters before him in a pile, and that nominal values are assigned to the 
counters in the following manner. A’s bottom counter is given the nominal value $q/p$; the next is given
the nominal value $(q/p)^2$, and so on until his top counter which has the nominal value $(q/p)^a$. B’s top
counter is valued $(q/p)^{a+1}$, and so on downwards until his bottom counter which is valued $(q/p)^{a+b}$. After
each game the loser’s top counter is transferred to the top of the winner’s pile, and it is always the top 
counter which is staked for the next game. Then in terms of the nominal values B’s stake is always q/p 
times A’s, so that at every game each player’s nominal expectation is nil. This remains true throughout 
the play; therefore A’s chance of winning all B’s counters, multiplied by his nominal gain if he does so, 
must equal B’s chance multiplied by B’s nominal gain. Thus 


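$$P_b\{(q/p)^{a+1} + (q/p)^{a+2} + \dots + (q/p)^{a+b}\} = P_a\{(q/p) + (q/p)^2 + \dots + (q/p)^a\}.$$

The use of $P_a + P_b = 1$ now gives immediately

$$P_b = \frac{(q/p)^a - 1}{(q/p)^{a+b} - 1}, \tag{1}$$

and this is the probability of the ‘gambler’s ruin’.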
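
As a numerical check on (1), the following Python sketch compares the closed form with a game-by-game computation of the absorption probabilities. The function names are illustrative and belong to none of the historical sources; the value $a/(a+b)$ used when $p = q$ is the usual limiting form of (1) and is stated here as an assumption rather than quoted from the text.

```python
def ruin_probability_b(a, b, p):
    """Formula (1): chance that A wins all of B's counters (B is ruined)."""
    q = 1.0 - p
    if abs(p - q) < 1e-12:
        return a / (a + b)  # limiting form of (1) as p -> q (assumed, not quoted)
    r = q / p
    return (r**a - 1.0) / (r**(a + b) - 1.0)


def absorption_within(a, b, p, n):
    """Return (P_{a,n}, P_{b,n}) by propagating A's net gain game by game."""
    q = 1.0 - p
    dist = {0: 1.0}  # probability of each net gain of A while play continues
    p_a = p_b = 0.0
    for _ in range(n):
        nxt = {}
        for gain, w in dist.items():
            for step, pr in ((1, p), (-1, q)):
                g = gain + step
                if g == b:
                    p_b += w * pr  # B has lost all b counters
                elif g == -a:
                    p_a += w * pr  # A has lost all a counters
                else:
                    nxt[g] = nxt.get(g, 0.0) + w * pr
        dist = nxt
    return p_a, p_b


if __name__ == "__main__":
    a, b, p = 3, 2, 0.6
    p_a, p_b = absorption_within(a, b, p, 2000)
    print(ruin_probability_b(a, b, p), p_b, p_a + p_b)
```

After a large number of games the two printed values of $P_b$ should agree closely, and $P_a + P_b$ should be indistinguishable from 1.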

In terms of the counters, A’s total expected gain is $bP_b - aP_a$, while his expectation per game is $p - q$.
These obvious facts are indeed only special cases of a more general result given by de Moivre (1718,
pp. 135-6; 1738, pp. 48-9; 1756, pp. 55-6). De Moivre does not actually divide one expression by the
other, but, since the total expectation equals the expectation per game times the expected number of
games, this division is all that is required in order to get the expected number of games

$$E(N) = \frac{bP_b - aP_a}{p - q}. \tag{2}$$

De Moivre was also the first to discover and publish a general method for calculating $P_{a,n} + P_{b,n}$, thus
finding the chance that the play would terminate within $n$ games. For the case where $a$ is infinite (so that
$P_{a,n} = 0$) and $n - b$ is odd, he found

$$P_{b,n} = \text{first } \tfrac12(n-b+1) \text{ terms of } (p+q)^n + \text{first } \tfrac12(n-b+1) \text{ terms of } (p/q)^b(q+p)^n. \tag{3}$$

This solution, with a similar one for the case where $n - b$ is even, was given without proof in his De Mensura
Sortis and later in The Doctrine of Chances (1711, p. 262; 1718, pp. 119-20; 1738, p. 179; 1756, pp. 208-9).
Fieller (1931) has drawn attention to this result and also provided a simple and elegant proof.
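
De Moivre’s rule (3) is easy to test numerically. The sketch below, in Python, evaluates the two truncated binomial expansions and compares them with a direct game-by-game computation for a single barrier; the function names are illustrative assumptions, and only the case $n - b$ odd quoted above is covered.

```python
from math import comb


def de_moivre_one_barrier(b, n, p):
    """Formula (3): P_{b,n} when a is infinite and n - b is odd."""
    if n < b or (n - b) % 2 != 1:
        raise ValueError("formula (3) as quoted needs n >= b and n - b odd")
    q = 1.0 - p
    terms = (n - b + 1) // 2
    # first terms of (p + q)^n, taken in descending powers of p
    head_p = sum(comb(n, i) * p**(n - i) * q**i for i in range(terms))
    # first terms of (q + p)^n, taken in descending powers of q
    head_q = sum(comb(n, i) * q**(n - i) * p**i for i in range(terms))
    return head_p + (p / q)**b * head_q


def reach_within(b, n, p):
    """Direct computation: chance that A's net gain reaches +b within n games."""
    q = 1.0 - p
    dist = {0: 1.0}
    hit = 0.0
    for _ in range(n):
        nxt = {}
        for gain, w in dist.items():
            if gain + 1 == b:
                hit += w * p  # B loses his last counter
            else:
                nxt[gain + 1] = nxt.get(gain + 1, 0.0) + w * p
            nxt[gain - 1] = nxt.get(gain - 1, 0.0) + w * q
        dist = nxt
    return hit


if __name__ == "__main__":
    print(de_moivre_one_barrier(3, 10, 0.55), reach_within(3, 10, 0.55))
```

The two printed numbers should coincide to rounding error, which also illustrates how few terms the rule needs when $n$ is moderate.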

De Moivre’s first solution of the general problem of calculating $P_{a,n} + P_{b,n}$ when both $a$ and $b$ are finite
(1711, p. 261; 1718, pp. 113-14; 1738, stated incorrectly on pp. 173-4; 1756, p. 203) called for the
performance of $n - 1$ multiplications and the rejection of certain terms during the process. For moderate $n$
the calculation is not so tedious as appears at first sight, and it has the advantage of giving the answer
reduced to the smallest number of terms; as de Moivre later pointed out, the rejected terms can also be
used to obtain $P_{a,n}$ and $P_{b,n}$ separately.

However, a few months before de Moivre’s method actually appeared (for the Philosophical Transactions
for 1711 were delayed in the press), a different solution giving $P_{a,n}$ and $P_{b,n}$ separately had been
found and was soon published by de Montmort (1713). This result is of particular interest because it
provides one of the easiest solutions of the problem, since the series which can be derived from it by using
modern tables is rapidly convergent over the range of values of $n$ where the play is likely to terminate.

In 1710 de Montmort found a method for calculating $P_{a,n}$ and $P_{b,n}$ for the case $p = q$. He sent some
numerical results to John Bernoulli, who passed the letter to his nephew Nicholas. In a reply dated
26 February 1711, published by de Montmort (1713, p. 308 et seq.), Nicholas Bernoulli gave without
proof the general solution for the case $p \neq q$; in modern notation it can be written as follows:


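$$\begin{aligned}
P_{b,n} = {} & \sum_t \Big\{ p^{ts+b} q^{ts} \sum_i \binom{n}{i} \big(p^{n-b-2ts-i} q^i + q^{n-b-2ts-i} p^i\big) \Big\} \\
& - \sum_t \Big\{ p^{ts+a+b} q^{ts+a} \sum_i \binom{n}{i} \big(p^{n-b-2ts-2a-i} q^i + q^{n-b-2ts-2a-i} p^i\big) \Big\}.
\end{aligned} \tag{4}$$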


In this formula $s = a+b$; the summation over $i \geq 0$ continues until the terms in the series in each curly
bracket, re-arranged in descending powers of $p$, meet in the middle (the middle term counting only once
if $n - b$ is even); and the summation over $t$ covers all values $\geq 0$ which leave non-negative exponents
within the summation over $i$ on the line concerned. Bernoulli stated the result for $n - b$ even, but in
fact (4) is also valid if $n - b$ is odd.

Not content with this, Nicholas Bernoulli confirmed that the limit of (4) as $n \to \infty$ gives the correct
value for $P_b$. He does not give his method but it is not difficult to guess; if for example $p > q$ it is only
necessary to re-write the two lines of (4) as

$$\begin{aligned}
& \sum_t p^{-ts} q^{ts} \big[p^n + n p^{n-1} q + \dots + p^{2ts+b} q^{n-2ts-b}\big] \\
& - \sum_t p^{-ts-a} q^{ts+a} \big[p^n + n p^{n-1} q + \dots + p^{2ts+2a+b} q^{n-2ts-2a-b}\big].
\end{aligned} \tag{5}$$

As $n \to \infty$ the sums in each square bracket tend to 1; this follows from (James) Bernoulli’s Theorem,
which at the time had not been published but which was known to Nicholas. The expression thus reduces
to two geometric series, and is immediately seen to agree with (1) above. In passing, it may be noted that
as $a \to \infty$ the expression (4) reduces to de Moivre’s expression (3).
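
Because the summation limits in (4) are stated in words, a short computation may make them concrete. The Python sketch below is one reading of those rules (the inner sum stops where the two halves meet, with the middle term counted once) and checks the result against a direct game-by-game calculation. The function names and the direct routine are illustrative assumptions, not Bernoulli’s or de Montmort’s procedure.

```python
from math import comb


def _inner_sum(n, e, p, q):
    """Sum over i of C(n,i)(p^(e-i) q^i + q^(e-i) p^i), stopping in the middle."""
    total = 0.0
    for i in range(e // 2 + 1):
        if 2 * i < e:
            total += comb(n, i) * (p**(e - i) * q**i + q**(e - i) * p**i)
        else:  # 2*i == e: the two halves meet, so the middle term counts once
            total += comb(n, i) * (p * q)**i
    return total


def bernoulli_formula(a, b, n, p):
    """P_{b,n} from (4), with s = a + b and t running while exponents stay >= 0."""
    q = 1.0 - p
    s = a + b
    total = 0.0
    t = 0
    while n - b - 2 * t * s >= 0:
        e1 = n - b - 2 * t * s              # leading exponent on the first line
        total += p**(t * s + b) * q**(t * s) * _inner_sum(n, e1, p, q)
        e2 = e1 - 2 * a                     # leading exponent on the second line
        if e2 >= 0:
            total -= p**(t * s + a + b) * q**(t * s + a) * _inner_sum(n, e2, p, q)
        t += 1
    return total


def ruin_of_b_within(a, b, n, p):
    """Direct computation of P_{b,n} with absorbing barriers at -a and +b."""
    q = 1.0 - p
    dist = {0: 1.0}
    p_b = 0.0
    for _ in range(n):
        nxt = {}
        for gain, w in dist.items():
            for step, pr in ((1, p), (-1, q)):
                g = gain + step
                if g == b:
                    p_b += w * pr
                elif g != -a:
                    nxt[g] = nxt.get(g, 0.0) + w * pr
        dist = nxt
    return p_b


if __name__ == "__main__":
    for case in ((3, 2, 15, 0.55), (2, 4, 20, 0.45), (1, 1, 9, 0.3)):
        print(case, bernoulli_formula(*case), ruin_of_b_within(*case))
```

Both parities of $n - b$ appear among the cases, in line with the remark that (4) holds whether $n - b$ is even or odd.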

When de Montmort saw this extraordinary solution he admitted that he could not follow it (this was
partly because Bernoulli had inadvertently used one symbol in two senses), and remarked: ‘votre
formule m’étonne pour sa generalité’ (1713, p. 316). Later, in comparing it with his own, he said: ‘je n’ai
eu en vue que la supposition des hazards égaux pour l’un et pour l’autre Joueur, au lieu que vous les
supposés dans un rapport quelconque’ (1713, p. 345). De Montmort’s solution, which he then describes
briefly, consisted of a method of picking out the binomial coefficients in (4) from Pascal’s triangle; this
was of course sufficient when $p = q$, and was in itself a remarkable result to have found. Nevertheless, it
seems clear that the solution (4) of the general case $p \neq q$, though often described as de Montmort’s, was
in fact found first by Nicholas Bernoulli.

De Montmort reproduced (4) in the body of his book, gave an example and added a most interesting 
though far from rigorous demonstration (1713, pp. 268-72). De Moivre at first called the result ‘very 
handsom’ (1718, p. 122), but later criticized de Montmort’s statement of it (which indeed is not entirely 
correct) and seems to hint that he had found the same method of solution before the year 1711 (see 1738, 
pp. 181-2; 1756, pp. 210-11). This is certainly possible, though it may be doubted whether de Moivre had 
carried the investigation of (4) as far as Bernoulli; perhaps he used it in particular cases, but did not 
pursue the matter because his own result gave $P_{a,n} + P_{b,n}$ in a smaller number of terms.

De Moivre later solved the Duration of Play problem in two further ways, and in the course of his work 
made an extensive investigation of recurring series (which he was the first to explore). His results included 
the partial fraction expansion of a generating function (1738, pp. 197-99; 1756, pp. 224-7); he found the 
probability of runs of successes (1738, pp. 243-8; 1756, pp. 254-9), and of course made the original
derivation of the normal distribution (1738, pp. 235-43; 1756, pp. 243-50). On the Duration of Play
problem itself he expressed $P_{b,n}$ as a recurring series with fewer terms than (4); and finally he discovered
the first results on the trigonometrical solution (see Feller, 1950, p. 292, equation 5.7), including the
asymptotic form for $P_{b,n}$ when $a = b$ and $p = q$. For fuller details of his work, and of its subsequent
development by Laplace and many others, the reader is referred to Todhunter (1865) and Fieller (1931).

It remains to show the link between these early solutions and modern work. This stems from the
well-known fact that the Duration of Play situation can be regarded as a linear random walk with two
absorbing barriers, such that the movement of the particle at each jump has a distribution with mean $\mu = p - q$
and variance $\sigma^2 = 4pq$. To complete the comparison a simple approximation is required, namely

$$(p/q)^{\lambda} \simeq \exp(2\lambda\mu/\sigma^2), \tag{6}$$

which can be shown to apply with sufficient accuracy in the cases for which it will be required.

If then in equations (1) and (2) we make the substitutions (6) and $p - q = \mu$ we shall obtain approximations
for the probability of absorption at a given barrier, and for the expected number of steps before
absorption at either barrier, in the corresponding random walk; and under the conditions of the central
limit theorem these will be valid for all walks with given finite $\mu$ and $\sigma$, provided that the number of steps
is sufficiently large. It can be seen by inspection that the transformed versions of equations (1) and (2) are
in fact the same as Wald’s approximations for the operating characteristic and average sample number of
a sequential test, in the form quoted by Page (1954, equations 5, 7).

We can similarly transform (3), making the normal approximation to the binomial expressions; it will
be found that the result agrees with that given by Bartlett (1946, equation 8), obtained as the solution of
a differential equation for the diffusion process. It is of interest to note that the same result can also be
used to find a quick approximate solution of a storage problem considered in a recent paper by Anis
(1956). This concerns a reservoir, of unlimited capacity, which has initial water level $x$; this level varies
each year by an amount distributed with zero mean and unit variance. When $n$ and $x$ are sufficiently
large we can ignore the end-effects and assume that the probability that the reservoir will run dry within
$n$ years is approximately the same as the probability that B will lose $b = x$ counters within $n$ trials (where
$a$ is infinite and $p = q = \tfrac12$). By de Moivre’s result (3) this probability will be twice the sum of the first
$\tfrac12(n - x + 1)$ terms of $(\tfrac12 + \tfrac12)^n$. Hence, for large $n$ and $x$ the probability that the reservoir will not run dry
within $n$ years can be expressed approximately as $2\int_0^{x/\sqrt{n}} e^{-\frac12 t^2}/\sqrt{(2\pi)}\,dt$, and it is easy to verify that
this distribution has the same moment ratios as the limiting values found by Anis.

Finally, we come to Nicholas Bernoulli’s general solution of the Duration of Play. If for any value of
$t$ either line of (4) is arranged in descending powers of $p$, it will be found to be the sum of multiples of
two binomial expressions in the same way as (3); see also Fieller (1931, equation 10.1), who proceeds to
obtain the exact solution of the problem in a convenient form as a series of multiples of incomplete
beta-functions, and also provides a rigorous proof.


The application of (6) and the normal approximation to the binomial puts the solution in the simple
approximate form of a series of multiples of normal integrals $\frac{1}{\sqrt{(2\pi)}}\int e^{-\frac12 x^2}\,dx$; this series agrees with the (exact) result given by Bartlett (1946,
equation 17) for the diffusion process. In view of the usefulness of this series it is worth repeating here
for completeness





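$$P_{b,n} = F(b) - \omega(-a)F(b+2a) + \omega(-a-b)F(3b+2a) - \omega(-2a-b)F(3b+4a) + \dots, \tag{7}$$

where

$$F(\lambda) = Q\!\left(\frac{\lambda - \mu n}{\sigma\sqrt{n}}\right) + \omega(\lambda)\,Q\!\left(\frac{\lambda + \mu n}{\sigma\sqrt{n}}\right),$$

$$Q(\lambda) = \int_{\lambda}^{\infty} \frac{1}{\sqrt{(2\pi)}}\,e^{-\frac12 x^2}\,dx,$$

$$\omega(\lambda) = \exp(2\lambda\mu/\sigma^2).$$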


The corresponding series for $P_{a,n}$ is found by interchanging $a$ with $b$ and changing the sign of $\mu$ in the
definitions of $F$ and $\omega$.

It will be found that (7) converges rapidly over the range of $n$ where the process is likely to terminate,
and so (as suggested by Bartlett) provides a rapid approximation for the probability that a particle
starting at the origin, with a jump distribution having mean $\mu$ and variance $\sigma^2$, will reach $x = b$ (without
having previously been absorbed at $x = -a$) within $n$ jumps. It can similarly be used to find the chance
that a linear sequential test will end within $n$ trials, or that a finite reservoir with random net input will
either dry up or overflow within a given time.
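
The speed of convergence of (7) can be seen in a small experiment. The Python sketch below sums the series for the $\pm 1$ walk of the Duration of Play and compares it with a direct computation. It assumes that the terms continue in pairs $\omega(-ts)F(b+2ts) - \omega(-ts-a)F(b+2ts+2a)$ with $s = a+b$, a pattern inferred from the four terms displayed and from (4); the function names and the number of pairs summed are likewise illustrative.

```python
from math import erfc, exp, sqrt


def q_tail(x):
    """Q(x): upper tail area of the standard normal distribution."""
    return 0.5 * erfc(x / sqrt(2.0))


def series_7(a, b, n, mu, sigma, pairs=10):
    """Approximate P_{b,n} by summing `pairs` pairs of terms of series (7)."""
    root = sigma * sqrt(n)

    def omega(lam):
        return exp(2.0 * lam * mu / sigma**2)

    def big_f(lam):
        return (q_tail((lam - mu * n) / root)
                + omega(lam) * q_tail((lam + mu * n) / root))

    s = a + b
    total = 0.0
    for t in range(pairs):
        total += omega(-t * s) * big_f(b + 2 * t * s)
        total -= omega(-t * s - a) * big_f(b + 2 * t * s + 2 * a)
    return total


def exact_b_within(a, b, n, p):
    """Direct computation of P_{b,n} for the +1/-1 walk with barriers -a, +b."""
    q = 1.0 - p
    dist = {0: 1.0}
    p_b = 0.0
    for _ in range(n):
        nxt = {}
        for x, w in dist.items():
            for step, pr in ((1, p), (-1, q)):
                y = x + step
                if y == b:
                    p_b += w * pr
                elif y != -a:
                    nxt[y] = nxt.get(y, 0.0) + w * pr
        dist = nxt
    return p_b


if __name__ == "__main__":
    a, b, n, p = 10, 10, 100, 0.55
    mu, sigma = 2 * p - 1, sqrt(4 * p * (1 - p))
    print(series_7(a, b, n, mu, sigma), exact_b_within(a, b, n, p))
```

In trials of this kind only the first two or three terms contribute appreciably, which is the practical content of the remark on rapid convergence.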













REFERENCES 


ANIS, A. A. (1956). Biometrika, 43, 79.

BARTLETT, M. S. (1946). Proc. Camb. Phil. Soc. 42, 239.

DE MOIVRE, A. (1711). De Mensura Sortis. Phil. Trans. 27, 213.

DE MOIVRE, A. (1718). The Doctrine of Chances, 1st ed. London.

DE MOIVRE, A. (1738). The Doctrine of Chances, 2nd ed. London.

DE MOIVRE, A. (1756). The Doctrine of Chances, 3rd ed. London.

DE MONTMORT, P. R. (1713). Essai d’Analyse sur les Jeux de Hazard, 2nd ed. Paris.

FELLER, W. (1950). An Introduction to Probability Theory and its Applications. New York: Wiley.

FIELLER, E. C. (1931). Biometrika, 22, 377.

PAGE, E. S. (1954). J. R. Statist. Soc. B, 16, 136.

TODHUNTER, I. (1865). History of the Theory of Probability. Cambridge and London: Macmillan.


Optimal sampling for quota fulfilment 


By N. L. JOHNSON 
University College London 


1. The problem to be discussed in this paper arises in the following way. It is desired to obtain
a sample from a stratified population in such a way that there are exactly $m_i$ individuals from stratum
$w_i$ ($i = 1, \dots, k$). It is more convenient to take a random sample from the whole population, and to
ascertain subsequently the strata to which the chosen individuals belong, than to search for individuals
belonging to specified strata. Therefore, a first sample of $N$ individuals is chosen without regard to
stratification and any shortfall is made up by a further set of samples, each restricted to one of the
deficient strata, and of such a size as to provide the required number of individuals from each of the
strata. Thus, if the first sample of $N$ contains $n_i$ ($< m_i$) individuals from stratum $w_i$ then the subsequent
sample from this stratum will contain $m_i - n_i$ individuals but if $n_i \geq m_i$, no subsequent sample from this
stratum will be required.

If $c$ is the cost per individual in the first (unrestricted) sample, and $c_i$ the cost per individual for
a sample restricted to stratum $w_i$, then the expected cost of obtaining the complete sample is

$$C_1 = cN + \sum_{i=1}^{k} c_i\,\mathcal{E}(m_i - n_i \mid n_i < m_i)\Pr\{n_i < m_i\}, \tag{1}$$

where $n_i$ is the number of observations included in $w_i$ in the first unrestricted sample. If the unused
individuals in stratum $w_i$ with numbers in excess of requirements are worth $c_i'$ each the expected cost is

$$C_2 = cN + \sum_{i=1}^{k} \big[c_i\,\mathcal{E}(m_i - n_i \mid n_i < m_i)\Pr\{n_i < m_i\} + c_i'\,\mathcal{E}(m_i - n_i \mid n_i > m_i)\Pr\{n_i > m_i\}\big]. \tag{2}$$

$C_1$ can, of course, be regarded as a special case of $C_2$.
2. If it is supposed that the joint distribution of $n_1, n_2, \dots, n_k$ is multinomial with parameters $p_1, p_2, \dots, p_k$
(as would be appropriate if sampling from a large population with proportions $p_1, p_2, \dots, p_k$ in strata
$w_1, w_2, \dots, w_k$, respectively, were being considered) then

$$\mathcal{E}(m_i - n_i) = m_i - Np_i$$

and (2) can be written

$$C_2 = cN + \sum_{i=1}^{k} (c_i - c_i')\,\mathcal{E}(m_i - n_i \mid n_i < m_i)\Pr\{n_i < m_i\} + \sum_{i=1}^{k} c_i'(m_i - Np_i). \tag{2a}$$

Using Gruder’s formula

$$\sum_{r=0}^{m-1} (r - Np)\binom{N}{r} p^r q^{N-r} = -m\binom{N}{m} p^m q^{N-m+1},$$

$C_2$ can be expressed in the form

$$C_2 = cN + \sum_{i=1}^{k} (c_i - c_i')\Big[(m_i - Np_i)\sum_{j=0}^{m_i-1}\binom{N}{j} p_i^j q_i^{N-j} + m_i\binom{N}{m_i} p_i^{m_i} q_i^{N-m_i+1}\Big] + \sum_{i=1}^{k} c_i'(m_i - Np_i). \tag{3}$$

If $N$ is large enough and $|m_i - Np_i|(Np_iq_i)^{-1/2}$ is not too large, approximate expressions for the expected
cost can be obtained by putting

$$\mathcal{E}(m_i - n_i \mid n_i < m_i)\Pr\{n_i < m_i\} = (Np_iq_i)^{1/2}\,(X_i I(X_i) + Z(X_i)),$$

where

$$X_i = \frac{m_i - Np_i}{(Np_iq_i)^{1/2}}, \qquad Z(X) = \frac{1}{\sqrt{(2\pi)}}\,e^{-\frac12 X^2}, \qquad I(X) = \frac{1}{\sqrt{(2\pi)}}\int_{-\infty}^{X} e^{-\frac12 t^2}\,dt.$$

Then

$$C_2 = cN + \sum_{i=1}^{k} (c_i - c_i')(Np_iq_i)^{1/2}(X_i I(X_i) + Z(X_i)) + \sum_{i=1}^{k} c_i'(m_i - Np_i). \tag{4}$$


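Approximate values of $N$ minimizing $C_2$ can be found by equating the differential coefficient of (4),
with regard to $N$, to zero. The resulting equation is

$$c + \tfrac12\sum_{i=1}^{k} (c_i - c_i')\left(\frac{p_iq_i}{N}\right)^{1/2} Z(X_i) = \sum_{i=1}^{k} p_i\big[c_i I(X_i) + c_i'(1 - I(X_i))\big]. \tag{5}$$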

3. However, an exact approach leads to a simpler solution in the present instance. If $N$ is increased by
unity, the probability that the extra observation comes from $w_i$ is $p_i$. The cost of obtaining the extra
observation is $c$, but there is an amount $c_i$ to set against this if $w_i$ is a deficient stratum or $c_i'$ if the first
sample contains $m_i$ or more individuals from $w_i$. Hence, the change in the expected cost is

$$\Delta C_2 = c - \sum_{i=1}^{k} p_i\big[(c_i - c_i')\Pr\{n_i < m_i\} + c_i'\big]. \tag{6}$$

Optimal values of $N$ are obtained as the least value of $N$ for which $\Delta C_2 > 0$. These optimal values will
approximately satisfy the equation $\Delta C_2 = 0$, i.e.

$$\sum_{i=1}^{k} \frac{(c_i - c_i')\,p_i}{c - \sum_{j=1}^{k} c_j' p_j}\Pr\{n_i < m_i\} = 1, \tag{7}$$

or

$$\sum_{i=1}^{k} g_i \Pr\{n_i < m_i\} = 1, \tag{7a}$$

where

$$g_i = \frac{(c_i/c) - (c_i'/c)}{1 - \sum_{j=1}^{k} (c_j'/c)\,p_j}\,p_i.$$

If every $c_j' = 0$, $g_i = (c_i/c)\,p_i$.


In the special case of complete symmetry where

$$p_i = 1/k; \quad c_i/c = d; \quad c_i'/c = d'; \quad m_i = m,$$

then $g_i = \dfrac{d - d'}{k(1 - d')}$ and (7a) becomes

$$\Pr\{n_i < m\} = \frac{1 - d'}{d - d'}. \tag{8}$$
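
The exact rule of this section is simple to apply by machine in the symmetrical case. The Python sketch below finds the least $N$ for which the change in expected cost is positive and confirms that this $N$ minimizes the expected cost (2a) computed from the binomial distribution. Costs are expressed in units of $c$; the function names and the search limit are illustrative assumptions, and the sketch is not claimed to reproduce the tabulated values, which were obtained from incomplete beta-function tables and from an approximate formula.

```python
from math import comb


def prob_short(N, m, k):
    """Pr{n_i < m} when n_i is binomial with index N and chance 1/k."""
    p = 1.0 / k
    return sum(comb(N, j) * p**j * (1.0 - p)**(N - j)
               for j in range(min(m, N + 1)))


def delta_cost(N, m, k, d, d_prime):
    """Equation (6) in the symmetrical case, in units of c."""
    return 1.0 - ((d - d_prime) * prob_short(N, m, k) + d_prime)


def optimal_first_sample(m, k, d, d_prime, n_max=100000):
    """Least N for which the change in expected cost is positive."""
    for N in range(n_max + 1):
        if delta_cost(N, m, k, d, d_prime) > 0:
            return N
    raise ValueError("no optimum below n_max; check that d > 1 > d_prime")


def expected_cost(N, m, k, d, d_prime):
    """C_2 / c from (2a), using the exact binomial distribution."""
    p = 1.0 / k
    shortfall = sum((m - j) * comb(N, j) * p**j * (1.0 - p)**(N - j)
                    for j in range(min(m, N + 1)))
    return N + k * (d - d_prime) * shortfall + k * d_prime * (m - N * p)


if __name__ == "__main__":
    m, k, d, d_prime = 10, 5, 2.0, 0.5
    best = optimal_first_sample(m, k, d, d_prime)
    print(best, expected_cost(best, m, k, d, d_prime))
```

Since $\Pr\{n_i < m\}$ falls steadily as $N$ grows, the change (6) rises steadily, so the first $N$ at which it turns positive is the minimum of the cost curve.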


4. Table 1 gives optimal values of $N$ for this symmetrical case for

$k$ = 2(1)10,
$m$ = nearest integer to $50k^{-1}$, $100k^{-1}$, $200k^{-1}$, $500k^{-1}$,
$c_i/c = d$ = 1.25, 1.5, 2.0, 2.5, 3.0,
$c_i'/c = d'$ = 0.9, 0.7, 0.25, 0.


Values of $c_i/c$ greater than one and of $c_i'/c$ less than one were chosen for the following reasons. If any
$c_i$’s are less than (or equal to) $c$ it is best to obtain the required sample of $m_i$ from those strata for which
$c_i \leq c$ by restricted sampling, and then to use the optimal $N$ for the remaining strata. Similarly, if all
the $c_i'$’s are greater than $c$ the expected cost decreases without limit as $N$ increases.

The values of N in Table 1 were obtained by using the tables of the incomplete beta-function (ed. 
K. Pearson (1934)) so far as the range of argument of these tables extended. This method could be used 
for nearly all combinations of d and d’ when km = 50 and for some values of d and d’ when km = 100 and 
$k = 2$. For higher values of $km$ (the total size of the required sample) calculations were based on the
approximate formula

$$\frac{k(m - \tfrac12) - N}{(N(k-1))^{1/2}} = \lambda_{\alpha}, \tag{9}$$

where $\alpha = (1 - d')/(d - d')$ as in (8) and $\lambda_{\alpha}$ is defined by $\dfrac{1}{\sqrt{(2\pi)}}\displaystyle\int_{-\infty}^{\lambda_{\alpha}} e^{-\frac12 t^2}\,dt = \alpha$.

Equation (9) leads to

$$N = k(m - \tfrac12)\{(1 + y^2)^{1/2} - y\}^2, \tag{10}$$

where

$$y = \tfrac12\lambda_{\alpha}\left(\frac{k-1}{k(m - \tfrac12)}\right)^{1/2}.$$

As the size, $km$, of the required sample increases, $y$ tends to zero and

$$N \simeq k(m - \tfrac12)(1 - 2y + 2y^2) = k(m - \tfrac12) - \lambda_{\alpha}\{k(k-1)(m - \tfrac12)\}^{1/2} + \tfrac12\lambda_{\alpha}^2(k-1). \tag{11}$$

This formula gives values for $N$ which are not in error by more than one when $km = 50$ and can be used
with reasonable confidence when $km \geq 100$ provided $k$ is not too large.
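
A brief computation shows how (10) and (11) behave. The Python sketch below evaluates both for given $k$, $m$, $d$ and $d'$, taking $\lambda_{\alpha}$ from the inverse normal distribution function in the standard library; it assumes, as read from (8), that $\alpha = (1 - d')/(d - d')$, and the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def approx_first_sample(m, k, d, d_prime):
    """Return (N from (10), N from (11)) for the symmetrical case."""
    alpha = (1.0 - d_prime) / (d - d_prime)   # right-hand side of (8)
    lam = NormalDist().inv_cdf(alpha)         # lambda_alpha in (9)
    big_m = k * (m - 0.5)
    y = 0.5 * lam * sqrt((k - 1) / big_m)
    n_10 = big_m * (sqrt(1.0 + y * y) - y) ** 2
    n_11 = (big_m - lam * sqrt(k * (k - 1) * (m - 0.5))
            + 0.5 * lam**2 * (k - 1))
    return n_10, n_11


if __name__ == "__main__":
    for args in ((25, 2, 3.0, 0.0), (100, 5, 1.5, 0.7)):
        print(args, approx_first_sample(*args))
```

For $k = 2$, $m = 25$, $d = 3$, $d' = 0$ both expressions give a value close to 52.1, and the two agree more closely still as $km$ increases.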

Table 1 shows that the optimal size $N$ for the first sample increases with $d$ and with $d'$, as might be
expected on intuitive grounds. For k = 2 the optimum N differs only slightly from the required sample 
size km, but as k increases the variation in optimal N with d and d’ becomes more pronounced. 

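The minimized value of $C_2$ is

$$C_{2(\min)} = c\big[N + k(d - d')\,\mathcal{E}(m - n_i \mid n_i < m)\Pr\{n_i < m\} + kd'(m - (N/k))\big].$$

Since $N$ is chosen to satisfy approximately the equation

$$\Pr\{n_i < m\} = (1 - d')/(d - d'), \tag{8 bis}$$

$$C_{2(\min)} = c\big[N + k(1 - d')\{m - (N/k) - \mathcal{E}(n_i - (N/k) \mid n_i < m)\} + kd'(m - (N/k))\big],$$

i.e.

$$C_{2(\min)} \simeq c\left[km + (d - d')\,m\binom{N}{m}\frac{(k-1)^{N-m+1}}{k^{N}}\right],$$

so that the ratio of the minimized cost to the cost of choosing the whole sample by restricted sampling
within each stratum is

$$\frac{C_{2(\min)}}{mckd} \simeq \frac{1}{d} + \left(1 - \frac{d'}{d}\right)\left(\frac{1}{k}\right)^{N+1}\binom{N}{m}(k-1)^{N-m+1}. \tag{12}$$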


Values of this ratio are given in Table 2 (calculated by Miss E. J. Smith) for the following values of
$k$, $m$, $d$ and $d'$:

$k$ = 2, 3, 4, 5, and 10,
$m$ = nearest integer to $50k^{-1}$, $100k^{-1}$, $500k^{-1}$,
$d$ = 1.5, 2.5, 3.0,
$d'$ = 0.5, 0.1, 0.0.

It may be noted that when N <km some restricted samples will be needed, and even for the optimal 
values of N in Table 1 the probability of requiring additional samples must be quite high. 


REFERENCE 
K. PEARSON (ed.) (1934). Tables of the Incomplete Beta-Function. Cambridge University Press. 
Table 1. Optimal values of N (symmetrical case)

      Required                       d
      sample
  k   size (km)    d'     1.25   1.5   2.0   2.5   3.0

  2      50       0.9       53    56    59    61    62
                  0.7       48    51    54    56    58
                  0.25      44    47    50    52    53
                  0         43    46    49    51    52

        100       0.9      106   109   113   116   117
                  0.7       98   102   107   110   111
                  0.25      93    97   101   103   105
                  0         91    9?    99   102   103

        200       0.9      207   213   219   222   224
                  0.7      198   204   210   213   216
                  0.25     190   196   202   205   208
                  0        187   193   199   203   205

        500       0.9      512   521   530   534   538
                  0.7      497   506   516   521   525
                  0.25     483   493   503   509   513
                  0        481   489   499   505   509

  3      51       0.9       55    60    65    67    69
                  0.7       49    53    57    60    62
                  0.25      43    47    51    54    56
                  0         42    45    49    52    54

         99       0.9      106   111   118   121   124
                  0.7       96   102   108   111   115
                  0.25      89    94   100   104   108
                  0         87    92    98   101   104

        201       0.9      213   222   229   234   237
                  0.7      199   208   216   222   227
                  0.25     188   196   205   210   213
                  0        185   193   201   205   208

        501       0.9      518   532   543   550   555
                  0.7      496   510   524   532   537
                  0.25     479   492   505   513   519
                  0        474   486   499   508   513

  4      52       0.9       58    63    67    69    71
                  0.7       49    54    60    63    65
                  0.25      43    47    53    56    58
                  0         41    45    50    54    56

        100       0.9      108   116   123   128   131
                  0.7       96   104   111   116   119
                  0.25      87    94   101   106   109
                  0         85    91    98   103   106

        200       0.9      212   223   233   239   243
                  0.7      196   205   217   223   228
                  0.25     182   192   203   209   213
                  0        179   188   198   204   209

        500       0.9      520   537   552   561   567
                  0.7      494   510   527   537   544
                  0.25     473   488   505   515   522
                  0        467   482   498   508   515

  5      50       0.9       56    62    67    70    72
                  0.7       47    53    59    62    64
                  0.25      39    45    51    54    57
                  0         37    42    48    52    54

        100       0.9      109   118   127   132   136
                  0.7       95   104   113   118   123
                  0.25      85    93   101   106   110
                  0         82    89    97   103   106

        200       0.9      214   227   235   245   250
                  0.7      195   207   219   227   232
                  0.25     181   191   203   210   215
                  0        177   186   197   205   210

        500       0.9      523   543   561   571   578
                  0.7      493   512   531   543   551
                  0.25     468   486   506   517   524
                  0        461   479   497   509   517

  6      48       0.9       55    61    68    73    77
                  0.7       44    51    58    61    63
                  0.25      36    42    49    53    55
                  0         34    39    46    50    53

        102       0.9      112   123   133   139   143
                  0.7       97   106   117   123   127
                  0.25      85    94   103   109   113
                  0         82    90    99   105   109

        198       0.9      213   228   241   249   254
                  0.7      192   205   219   228   234
                  0.25     175   187   201   209   215
                  0        170   182   195   203   209

        498       0.9      524   545   566   577   585
                  0.7      490   511   533   545   555
                  0.25     463   483   504   517   526
                  0        455   474   495   508   517

  7      49       0.9       57    64    73    77    81
                  0.7       45    52    60    63    67
                  0.25      36    43    50    54    58
                  0         34    40    46    51    54

         98       0.9      109   120   132   138   143
                  0.7       92   102   114   120   125
                  0.25      80    89    99   105   110
                  0         77    85    94   101   105

        203       0.9      221   237   253   262   268
                  0.7      196   211   228   237   244
                  0.25     177   191   206   216   222
                  0        171   185   199   209   216

        497       0.9      531   555   578   590   599
                  0.7      494   517   541   555   566
                  0.25     464   486   509   524   534
                  0        456   477   499   514   524
Table 1 (cont.)

      Required                       d
      sample
  k   size (km)    d'     1.25   1.5   2.0   2.5   3.0

  8      48       0.9       56    64    73    79    83
                  0.7       43    51    59    64    68
                  0.25      34    40    48    53    56
                  0         31    38    45    50    53

        104       0.9      116   129   142   149   154
                  0.7       97   109   121   129   134
                  0.25      84    94   105   112   117
                  0         80    89   100   107   111

        200       0.9      228   235   252   261   268
                  0.7      192   208   225   235   243
                  0.25     173   187   203   213   220
                  0        167   178   196   206   213

        504       0.9      533   561   585   599   608
                  0.7      494   519   545   561   572
                  0.25     462   485   511   526   537
                  0        453   475   500   515   526

  9      54       0.9       62    73    83    89    94
                  0.7       49    56    66    73    77
                  0.25      38    46    55    59    63
                  0         36    43    51    56    59

         99       0.9      111   125   138   146   152
                  0.7       92   104   117   125   130
                  0.25      77    88   100   107   113
                  0         74    84    94   102   107

        198       0.9      217   235   253   263   270
                  0.7      189   206   225   235   243
                  0.25     169   184   201   211   219
                  0        163   177   193   204   211

        504       0.9      537   565   591   606   616
                  0.7      493   520   548   565   577
                  0.25     459   484   511   528   539
                  0        449   473   499   516   528

 10      50       0.9       59    69    80    86    91
                  0.7       44    53    62    69    73
                  0.25      34    41    50    55    59
                  0         31    38    46    52    55

        100       0.9      113   127   142   150   156
                  0.7       92   105   119   127   134
                  0.25      73    8?   101   108   114
                  0         67    83    95   103   108

        200       0.9      220   240   259   270   277
                  0.7      191   208   228   240   248
                  0.25     169   185   203   214   222
                  0        163   178   195   205   214

        500       0.9      534   564   592   608   619
                  0.7      489   517   547   564   577
                  0.25     451   478   508   525   537
                  0        442   467   495   512   525
Table 2. Ratio of expected cost of optimal sample to cost of restricted sampling

                              d
  k      m     d'      1.5      2.5      3.0

  2     25    0.5     0.704    0.437    0.369
              0.1     0.711    0.451    0.382
              0.0     0.716    0.455    0.386
  2     50    0.5     0.693    0.426    0.357
              0.1     0.700    0.437    0.368
              0.0     0.703    0.439    0.371
  2    250    0.5     0.679    0.412    0.344
              0.1     0.682    0.416    0.348
              0.0     0.683    0.417    0.350

  3     17    0.5     0.719    0.452    0.383
              0.1     0.733    0.473    0.403
              0.0     0.733    0.478    0.408
  3     33    0.5     0.704    0.438    0.368
              0.1     0.717    0.453    0.382
              0.0     0.717    0.456    0.386
  3    167    0.5     0.682    0.416    0.348
              0.1     0.688    0.423    0.355
              0.0     0.689    0.425    0.357

  4     13    0.5     0.730    0.465    0.386
              0.1     0.751    0.490    0.417
              0.0     0.749    0.493    0.423
  4     25    0.5     0.712    0.446    0.377
              0.1     0.728    0.464    0.394
              0.0     0.727    0.468    0.398
  4    125    0.5     0.687    0.420    0.352
              0.1     0.694    0.428    0.360
              0.0     0.694    0.430    0.362

  5     10    0.5     0.741    0.476      ?
              0.1       ?      0.504    0.434
              0.0     0.762    0.510    0.440
  5     20    0.5       ?      0.455    0.382
              0.1     0.732    0.474    0.404
              0.0     0.735    0.479    0.409
  5    100    0.5     0.690    0.424    0.355
              0.1     0.697    0.433    0.365
              0.0     0.699    0.435    0.366

 10      5    0.5     0.786    0.515    0.445
              0.1     0.801    0.558    0.487
              0.0     0.806    0.565    0.493
 10     10    0.5     0.745    0.482    0.409
              0.1     0.763    0.512    0.441
              0.0     0.767    0.518    0.448
 10     50    0.5     0.702    0.442    0.366
              0.1     0.711    0.450    0.380
              0.0     0.714    0.453    0.383
The distribution of intervals between successive maxima
in a series of random numbers


By D. S. PALMER
Marconi’s Wireless Telegraph Co., Ltd 


Let $u(r)$ be the $r$th in order of occurrence of a series of independent random numbers. The $n+3$ numbers
$u(-1), u(0), \dots, u(n+1)$ can occur in $(n+3)!$ different orders when placed in ascending order of magnitude,
and these orders have all equal probability independently of the frequency distribution of the $u$’s;
cases where two or more $u$’s are equal are here considered to have probability zero.

Let $Q(n)$ be the number of these orders which have $u(-1) < u(0) > u(1)$, but no further maxima, although
$u(n+1)$ may exceed $u(n)$. Of these, let $q(m,n)$ have $u(n+1)$ in position $m$ from the bottom. Then

$$Q(n) = \sum_{m=1}^{n+3} q(m,n). \tag{1}$$


Consider the $Q(n)$ orders, and the positions in which a further number $u(n+2)$ may be placed to make
an order of the group $Q(n+1)$. Four cases must be distinguished.

(A) $u(n+1) < u(n)$, $u(n+1) < u(-1)$.

$u(n+1)$ must have the lowest place and $u(-1)$ may have any of $n+1$ places in the ordering of
$u(-1) \dots u(n+1)$. Hence $q(1,n) = n+1$. $u(n+2)$ may have any place, so this case contributes $n+1$ to
each of $q(1,n+1), q(2,n+1), \dots, q(n+4,n+1)$.

(B) $u(n+1) < u(n)$, $u(n+1) > u(-1)$.

Only one ordering of $u(-1) \dots u(n+1)$ is possible, and $u(n+2)$ may have any place. There arises
a contribution of 1 to $q(1,n+1), q(2,n+1), \dots, q(n+4,n+1)$.

(C) $u(n+1) > u(n)$ and $u(n+1)$ has second place.

There are $n$ possible positions of $u(-1)$ with regard to $u(0) \dots u(n+1)$. $u(n+2)$ may have any place but
the two lowest. There arises a contribution of $n$ to $q(m,n+1)$ for each $m > 2$.

(D) $u(-1) \dots u(n+1)$ belong to $q(r,n)$ ($r > 2$).

$u(n+2)$ must exceed $u(n+1)$, contributing $q(r,n)$ to each $q(m,n+1)$ with $m > r$.

From cases (A), (B), (C) it follows that

$$q(1,n) = q(2,n) = n+1. \tag{2}$$


Summarizing the contributions to $q(m,n+1)$ we have

  Case     q(1,n+1)    q(2,n+1)    q(m,n+1)

  (A)        n+1         n+1         n+1
  (B)         1           1           1
  (C)         0           0          n  (m > 2)
  (D)         0           0          q(r,n)  (m > r)

Therefore $q(1,n+1) = q(2,n+1) = n+2$, agreeing with (2), and

$$q(m,n+1) = 2(n+1) + \sum_{r=3}^{m-1} q(r,n). \tag{3}$$

Assume that

$$q(m,n) = 2^{m-2}\Big(n - \frac{m-3}{2}\Big), \qquad 3 \leq m \leq n+3. \tag{4}$$
Then

$$\begin{aligned}
q(m,n+1) &= 2(n+1) + \sum_{r=3}^{m-1} 2^{r-2}\Big(n - \frac{r-3}{2}\Big) \\
&= 2(n+1) + \Big(n + \frac32\Big)\sum_{r=3}^{m-1} 2^{r-2} - \frac12\sum_{r=3}^{m-1} r\,2^{r-2} \\
&= 2(n+1) + (2n+3)\sum_{r=0}^{m-4} 2^{r} - \sum_{r=0}^{m-4} (r+3)\,2^{r} \\
&= 2(n+1) + 2n\sum_{r=0}^{m-4} 2^{r} - \sum_{r=0}^{m-4} r\,2^{r} \\
&= 2(n+1) + 2n(2^{m-3} - 1) - (m-5)\,2^{m-3} - 2 \\
&= 2^{m-2}\Big(n + 1 - \frac{m-3}{2}\Big),
\end{aligned}$$

agreeing with (4).

(4) is valid for $n = 1$ and 2 as may be seen by the following enumeration. Orders for which $u(n+1) < u(n)$
are enclosed in brackets.

  n = 1    q(1,1) = 2    (2431), (3421),
           q(2,1) = 2    (1432), 3412,
           q(3,1) = 2    2413, 1423,
           q(4,1) = 2    1324, 2314.

  n = 2    q(1,2) = 3    (25431), (45321), (35421),
           q(2,2) = 3    (15432), 45312, 35412,
           q(3,2) = 4    15423, 25413, 45123, 45213,
           q(4,2) = 6    15234, 15324, 25134, 25314, 35124, 35214,
           q(5,2) = 8    13245, 14325, 14235, 23145, 24135, 24315, 34125, 34215.
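Thus the validity of (4) is proved by induction.

By (3)

$$\begin{aligned}
q(n+4,n+1) &= 2(n+1) + \sum_{r=3}^{n+3} q(r,n) \\
&= \sum_{r=1}^{n+3} q(r,n) \quad \text{by (2)} \\
&= Q(n) \quad \text{by (1)}.
\end{aligned}$$

Hence, by (4),

$$Q(n) = 2^{n+2}\{n + 1 - \tfrac12(n+1)\} = 2^{n+1}(n+1). \tag{5}$$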
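
The enumeration above can be repeated mechanically for small $n$. The Python sketch below lists every order of $u(-1), \dots, u(n+1)$, keeps those with a maximum at $u(0)$ and no further maximum among $u(1), \dots, u(n)$, and counts them according to the position of $u(n+1)$; the results are compared with (2), (4) and (5). The function names are illustrative, and the method is plain exhaustive listing rather than the recurrence of the text.

```python
from itertools import permutations


def q_counts(n):
    """Return [q(1,n), ..., q(n+3,n)] by listing every order of u(-1)..u(n+1)."""
    size = n + 3
    counts = [0] * size
    for ranks in permutations(range(1, size + 1)):
        # ranks[j] is the rank of u(j-1), so ranks[1] belongs to u(0)
        if not (ranks[0] < ranks[1] > ranks[2]):
            continue  # u(0) must be a maximum
        if any(ranks[j - 1] < ranks[j] > ranks[j + 1]
               for j in range(2, size - 1)):
            continue  # a further maximum among u(1)..u(n)
        counts[ranks[-1] - 1] += 1  # position of u(n+1) from the bottom
    return counts


def q_formula(m, n):
    """Equation (2) for m = 1, 2 and equation (4) for m >= 3."""
    if m <= 2:
        return n + 1
    return 2 ** (m - 3) * (2 * n - m + 3)


if __name__ == "__main__":
    for n in (1, 2, 3):
        counts = q_counts(n)
        print(n, counts, sum(counts), 2 ** (n + 1) * (n + 1))
```

For $n = 1$ and $n = 2$ the counts are those of the enumeration, and in each case their total equals $2^{n+1}(n+1)$.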
Of the $(n+3)!$ orders of $u(-1) \dots u(n+1)$, one-third satisfy the condition that $u(-1) < u(0) > u(1)$.
Therefore

$$\begin{aligned}
P(n) &= \text{chance of an interval of } n \text{ between successive maxima} \\
&= Q(n-1)/\tfrac13(n+2)! - Q(n)/\tfrac13(n+3)! \\
&= \frac{3n\,2^{n}}{(n+2)!} - \frac{3(n+1)\,2^{n+1}}{(n+3)!} \\
&= 3\cdot 2^{n}(n-1)(n+2)/(n+3)! \qquad (n \geq 2).
\end{aligned} \tag{6}$$


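Then for the moments we have

$$\begin{aligned}
E(e^{nt}) &= 3\sum_{n=2}^{\infty} e^{nt}\,2^{n}(n-1)(n+2)/(n+3)! \\
&= e^{2e^{t}}\big(\tfrac32 e^{-t} - 3e^{-2t} + \tfrac32 e^{-3t}\big) - \tfrac32 e^{-3t} + \tfrac32 e^{-t} + 1 \\
&= 1 + 3t + \big(\tfrac32 e^{2} - 6\big)t^{2} + \dots
\end{aligned} \tag{7}$$

The first two terms are obviously correct, and the third gives

$$\text{S.D.} = (3e^{2} - 21)^{1/2} = 1.080.$$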
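
These values are readily confirmed by summing the distribution (6) directly. The Python sketch below uses exact fractions for $P(n)$ and truncates the sums at $n = 79$, beyond which the terms are negligible; the truncation point and the names are illustrative assumptions.

```python
from fractions import Fraction
from math import e, factorial, sqrt


def interval_probability(n):
    """P(n) from (6), valid for n >= 2."""
    return Fraction(3 * 2**n * (n - 1) * (n + 2), factorial(n + 3))


probs = {n: interval_probability(n) for n in range(2, 80)}
total = float(sum(probs.values()))
mean = float(sum(n * pr for n, pr in probs.items()))
second = float(sum(n * n * pr for n, pr in probs.items()))
sd = sqrt(second - mean**2)

if __name__ == "__main__":
    print([str(probs[n]) for n in (2, 3, 4, 5)])
    print(total, mean, sd, sqrt(3 * e**2 - 21))
```

The first four probabilities are 2/5, 1/3, 6/35 and 1/15, leaving 1/35 for intervals of six or more, and the mean and standard deviation agree with 3 and $(3e^2 - 21)^{1/2}$.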
The first 2000 two-figure random numbers in Kendall and Babington Smith’s Tables (Tracts for
Computers No. XXIV) give





n       P(n)     Expected   Observed   (O − E)²/E
2       2/5       266.0       251        0.85
3       1/3       221.7       241        1.71
4       6/35      114.0       107        0.43
5       1/15       44.3        50        0.73
≥6      1/35       19.0        16        0.47

Total             665         665























Mean 3.01, S.D. 1.04, χ² (4 d.f.) = 4.19.


I am indebted to Marconi’s Wireless Telegraph Co., Ltd. for permission to publish this note. 


The effect of ties on the moments of rank criteria 


By B. E. COOPER 
University College, London 


1. In a recent paper (1956) F. N. David remarked that the moments of any criterion based on ranks
could be derived by using Wishart’s (1952) tables of sampling formulae for a finite population of N, the 
K-statistics of such a population being those of the first N natural numbers. In this present note we use 
this method of approach to find the moments of the mean rank of a sample of n randomly drawn without 
replacement from a population consisting of the first N natural numbers, ties being allowed. We also 
obtain the effect of ties on the variance of the sample variance. 

2. Where t individuals tie we shall follow Putter (1955) and assign the mean rank in the tie to each
individual. We first find the effect of ties on the moments of the finite population. Let the (p+1)st to the
qth ranked element of the population tie, with (p+1) < q. We assign the common rank m = ½(p+q+1)
to each, and the contribution to the rth factorial moment of the population (i.e. the change occasioned
by replacing the numbers p+1, p+2, ..., q by m, m, ..., m) is thus


1 1 
Ay = WV {om — re (q+ 1)"t)—(p+ yr} 


—1 4 t+1\0 a 
5 r+1—j re, a 1 
~ N+) TAY j ‘)m “ (5 2 *) ( 2, } " 


Now suppose there are T groups of ties with $t_i$ in the ith group, which has common rank $x_i + \tfrac{1}{2}(N+1)$, for
i = 1, 2, ..., T. The changes in the moments due to each group are additive, and so, after a little reduction,
the population central moments are seen to be


$$
\begin{aligned}
\mu_1' &= \frac{N+1}{2}, \qquad \mu_2 = \frac{1}{12N}\Big\{N(N^2-1) - \sum_{i=1}^{T} t_i(t_i^2-1)\Big\}, \\
\mu_3 &= -\frac{1}{4N}\sum_{i=1}^{T} x_i\,t_i(t_i^2-1), \\
\mu_4 &= \frac{1}{240N}\Big\{N(N^2-1)(3N^2-7) - \sum_{i=1}^{T} t_i(t_i^2-1)(3t_i^2-7)\Big\} - \frac{1}{2N}\sum_{i=1}^{T} x_i^2\,t_i(t_i^2-1).
\end{aligned}
\tag{2}
$$

We give the moments rather than the K-statistics since the former are simpler.
3. If it is assumed that two samples of sizes $n_1$ and $n_2$ are arranged together in ascending order of magnitude
($N = n_1 + n_2$), then Wilcoxon’s procedure consists of finding the mean of the ranks of one sample, and
using this criterion to test for the equivalence of the location parameters of the two populations
generating the samples. The moments of the mean of the first sample (ties not admitted) may be
written down immediately from Wishart’s tables. When ties are allowed, the first moment is unchanged,
and the second moment becomes

$$\mu_2(\bar{x}_1) = \frac{(N-n_1)(N+1)}{12\,n_1}\left[1 - \frac{1}{N(N^2-1)}\sum_{i=1}^{T} t_i(t_i^2-1)\right]. \tag{3}$$


The third and fourth moments do not simplify at all and they are accordingly most simply computed 
in any instance by putting numerical values of (2) into Wishart’s formulae. 

The variance of $\bar{x}_1$ given in (3) above has previously been derived by Hemelrijk (1952), Kruskal (1952)
and Putter (1955). Computations were carried out for varying values of N and $n_1$, and for different
numbers of tied observations. It would appear, for such values of $n_1$ and N for which the normal approxi-
mation can be used in significance tests, that the effect of ties is not likely to lead to serious error, provided
that not more than half the observations are tied.
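The tie correction to the variance of the mean rank can be confirmed by brute force on a small case. The sketch below assumes the form of (3) in which the untied variance (N − n₁)(N + 1)/(12 n₁) is multiplied by 1 − Σ tᵢ(tᵢ² − 1)/{N(N² − 1)}. The tie pattern and function names are hypothetical, chosen only for illustration.

```python
from fractions import Fraction
from itertools import combinations


def midranks(tie_sizes):
    """Midranks for groups of tied observations listed in ascending order.

    A group of size 1 is an untied observation.
    """
    ranks, start = [], 0
    for t in tie_sizes:
        ranks += [Fraction(2 * start + t + 1, 2)] * t   # mean of start+1 .. start+t
        start += t
    return ranks


def var_mean_formula(tie_sizes, n1):
    """Variance of the mean rank of a sample of n1, with the tie correction of (3)."""
    N = sum(tie_sizes)
    s = sum(t * (t * t - 1) for t in tie_sizes)
    return Fraction((N - n1) * (N + 1), 12 * n1) * (1 - Fraction(s, N * (N * N - 1)))


def var_mean_enumerated(tie_sizes, n1):
    """Same variance by enumerating every sample of n1 drawn without replacement."""
    ranks = midranks(tie_sizes)
    means = [sum(c) / n1 for c in combinations(ranks, n1)]
    mu = sum(means) / len(means)
    return sum((x - mu) ** 2 for x in means) / len(means)


ties = [1, 2, 1, 3, 1]          # hypothetical pattern: N = 8, one pair and one triple tied
print(var_mean_formula(ties, 3), var_mean_enumerated(ties, 3))
```

Exact fractions are used so that the two routes can be compared for equality rather than to within rounding.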


4. Other procedures have been put forward whereby tests can be carried out for possible differences in
the parameters of dispersion of two parent populations. David (1956) suggested the variance of one
sample might be used, while Mood (1954) proposed W, the sum of the squares of the deviations of the
ranks of one sample from the population mean. It is clear that the same algebraic procedure used above
for the sample mean can be used to derive corrections to the moments of both David’s and Mood’s
criteria when ties are allowed. In fact these two criteria are merely the sample variance and the sample
sum of squares about ½(N+1) of a sample from the finite population whose first four moments are given
by (2).

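Thus using Abdel-Aty’s (1954) formulae, the first two moments of Mood’s statistic become

$$\mathcal{E}(W) = \frac{n_1}{12N}\Big\{N(N^2-1) - \sum_{i=1}^{T} t_i(t_i^2-1)\Big\}, \qquad \operatorname{var}(W) = \frac{n_1(N-n_1)}{N-1}\,(\mu_4 - \mu_2^2).$$

For David’s statistic, $k_2$, we have similarly,

$$\mathcal{E}(k_2) = \frac{1}{12(N-1)}\Big\{N(N^2-1) - \sum_{i=1}^{T} t_i(t_i^2-1)\Big\},$$

$$\operatorname{var}(k_2) = \frac{N(N-n_1)\,\{(Nn_1-N-n_1-1)(N-1)\,\mu_4 + (3(N-1)^2 - n_1(N^2-3))\,\mu_2^2\}}{n_1(n_1-1)(N-1)^2(N-2)(N-3)}.$$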


The variances of both these statistics do not simplify if the explicit values of $\mu_2$ and $\mu_4$ from (2) are
substituted. It is simplest therefore to compute them, in any given instance, using the numerical values
obtained from (2).
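The numerical route recommended here is easy to mechanise. The sketch below computes μ₂ and μ₄ for a hypothetical tied ranking and compares the moments of W and k₂ with a full enumeration of samples. The expressions for var(W) and var(k₂) are the standard finite-population results in terms of μ₂ and μ₄, assumed here to be the ones intended in the text.

```python
from fractions import Fraction
from itertools import combinations


def central_moments(ranks):
    """Population mean, mu_2 and mu_4 (divisor N) of a list of ranks."""
    N = len(ranks)
    mean = sum(ranks) / N
    mu2 = sum((r - mean) ** 2 for r in ranks) / N
    mu4 = sum((r - mean) ** 4 for r in ranks) / N
    return mean, mu2, mu4


def mean_and_variance(values):
    """Mean and variance of a statistic over all equally likely samples."""
    m = sum(values) / len(values)
    return m, sum((v - m) ** 2 for v in values) / len(values)


# Hypothetical tied ranking of N = 8 observations (midranks); first sample of size 3.
ranks = [Fraction(r, 2) for r in (2, 5, 5, 8, 12, 12, 12, 16)]
N, n1 = len(ranks), 3
mean, mu2, mu4 = central_moments(ranks)

samples = list(combinations(ranks, n1))
W = [sum((r - mean) ** 2 for r in s) for s in samples]                      # Mood's W
k2 = [sum((r - sum(s) / n1) ** 2 for r in s) / (n1 - 1) for s in samples]   # David's k_2

EW, varW = mean_and_variance(W)
Ek2, vark2 = mean_and_variance(k2)

# Formulae in terms of the population moments mu_2 and mu_4.
EW_formula = n1 * mu2
varW_formula = n1 * (N - n1) * (mu4 - mu2 ** 2) / (N - 1)
Ek2_formula = N * mu2 / (N - 1)
vark2_formula = (N * (N - n1)
                 * ((N * n1 - N - n1 - 1) * (N - 1) * mu4
                    + (3 * (N - 1) ** 2 - n1 * (N * N - 3)) * mu2 ** 2)
                 / (n1 * (n1 - 1) * (N - 1) ** 2 * (N - 2) * (N - 3)))

print(EW == EW_formula, varW == varW_formula, Ek2 == Ek2_formula, vark2 == vark2_formula)
```

Enumeration is only feasible for small N, which is why the closed forms matter in practice.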





5. Although the correction to the variance of the mean has been given by several persons, as far as the
writer knows there has been no discussion of corrections for higher moments of the mean, or for other
criteria. The general expression in (1) enables as many corrected moments of the population as desired to
be calculated, as in (2). Accordingly the only limitations are set by the extent of Wishart’s and Abdel-
Aty’s tables.


REFERENCES 


ABDEL-ATY, S. H. (1954). Ph.D. Thesis, University of London.
DAVID, F. N. (1956). Biometrika, 43, 485.
HEMELRIJK, J. (1952). Ann. Math. Statist. 23, 133.
KRUSKAL, W. H. (1952). Ann. Math. Statist. 23, 525.
MOOD, A. M. (1954). Ann. Math. Statist. 25, 514.
PUTTER, J. (1955). Ann. Math. Statist. 26, 368.
WISHART, J. (1952). Biometrika, 39, 1.




Approximations to the upper 5% points of Fisher’s B distribution
and non-central χ²*


By JOHN W. TUKEY 
Princeton University 


1. SUMMARY 


Some time ago the writer reported (1949) on an empirical approximation to the upper 5% point of
Fisher’s B distribution, the distribution of the square root of a non-central χ², comparing the approxima-
tion with certain of the exact values given by Fisher (1928) and suggesting that the empirical formula
might ‘be useful for moderate extrapolation’. At about the same time Patnaik (1949), and later Abdel-
Aty (1954), studied the problem of approximating the distribution of non-central χ². Through the efforts
and courtesy of Dr B. I. Harley, of University College, it is now possible to present a comparative table
of the results of various approximations, including the streamlined use of moment-fitted Pearson curves
by the aid of Table 42 of the new Biometrika tables (Pearson & Hartley, 1954).

While the empirical approximation is restricted to the upper 5% point, only the moment-fitting 
approach seems likely to give more accurate values in the range considered. Because of its simplicity, 
the empirical formula may prove useful. 


2. DISTRIBUTION AND APPROXIMATIONS

The simplest definition of a non-central χ² quantity is

$$x_1^2 + x_2^2 + \dots + x_{n-1}^2 + (x_n+\beta)^2 = \chi'^2 = B^2,$$

where the x’s are independent unit normals and $\lambda = \beta^2$ is fixed. In discussing the B distribution, Fisher
used β and $n_1 = n$. The empirical approximation to be considered is

$$B_{0.05} \approx 1.6449 + \beta + 0.51\,\frac{n_1-1}{\beta+1} - 0.024\,\frac{(n_1-5)(n_1-1)}{\beta(\beta+1)}. \tag{A}$$

(The simplest way to use this for a 5% point of χ'² is to find $B_{0.05}$ and then square the result.)

Patnaik’s approximation begins with inverse interpolation in the percentage point tables of χ² (first
approximation, pp. 207-8) and then follows with a Cornish-Fisher expansion (second approximation,
formula (25), page 213). A substantial amount of computation is required.

Abdel-Aty’s approximation begins with a cube-root transformation and continues with an expansion of
Cornish-Fisher type. The first approximation, that $\{\chi'^2/(n+\lambda)\}^{1/3} = B^{2/3}/(n_1+\beta^2)^{1/3}$ is roughly normally
distributed about $1 - \sigma^2$ with variance

$$\sigma^2 = \frac{2(n+2\lambda)}{9(n+\lambda)^2} = \frac{2(n_1+2\beta^2)}{9(n_1+\beta^2)^2},$$

is quite simple, but the ‘closer approximation’ involves considerable computation.

The approximation involved in the use of Table 42 of the new Biometrika tables corresponds to fitting
a Pearson curve to the first four moments of χ'², and accepting the percentage points of the fitted Pearson
curve.





3. ACCURACY AND COMPARISON 


Fisher’s table covers $n_1$ = 1(1)7 and β = 0(0.2)5.0. Comparison of exact and approximate values at
a reasonably rude grid is made in Table 1, which covers $n_1$ = 1(2)7, β = 1(2)5. The disagreement is
nowhere worse than 0.035 and reaches no more than 0.0037 for β ≥ 3. The fit is encouraging.
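Approximation (A) is simple enough to evaluate in a line or two. The sketch below assumes the form 1.6449 + β + 0.51(n₁ − 1)/(β + 1) − 0.024(n₁ − 5)(n₁ − 1)/{β(β + 1)}, which reproduces the "Approx." entries of Table 1. The exact values in the dictionary are transcribed from that table as printed and are not recomputed here.

```python
def approx_A(n1, beta):
    """Empirical approximation (A) to the upper 5% point of Fisher's B."""
    return (1.6449 + beta
            + 0.51 * (n1 - 1) / (beta + 1)
            - 0.024 * (n1 - 5) * (n1 - 1) / (beta * (beta + 1)))


# Exact values keyed by (n1, beta), as transcribed from Table 1.
TABLE_1_EXACT = {
    (1, 1): 2.6461, (3, 1): 3.1941, (5, 1): 3.6291, (7, 1): 4.0005,
    (1, 3): 4.6449, (3, 3): 4.9055, (5, 3): 5.1512, (7, 3): 5.3840,
    (1, 5): 6.6449, (3, 5): 6.8162, (5, 5): 6.9831, (7, 5): 7.1457,
}

for (n1, beta), exact in sorted(TABLE_1_EXACT.items(), key=lambda kv: (kv[0][1], kv[0][0])):
    a = approx_A(n1, beta)
    print(f"beta={beta}  n1={n1}  exact={exact:.4f}  approx={a:.4f}  diff={a - exact:+.4f}")
```

For a 5% point of the non-central χ² itself, the returned value would be squared.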

The comparison between the different methods of approximation is made in Table 2 in terms of non-
central χ² upper 5% points, squares of $B_{0.05}$ values. As noted in the summary, this table was undertaken
and computed by Dr B. I. Harley.
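Abdel-Aty's first approximation can be evaluated in the same spirit. This is a sketch under the assumption that the cube root of χ'²/(n + λ) is treated as normal with mean 1 − σ² and variance σ² = 2(n + 2λ)/{9(n + λ)²}, and that the upper 5% normal deviate is taken as 1.6449. The function name is illustrative.

```python
from math import sqrt

Z05 = 1.6449   # upper 5% point of the unit normal, the constant used in (A)


def abdel_aty_first(n, lam):
    """First (cube-root normal) approximation to the upper 5% point of non-central chi-square."""
    var = 2 * (n + 2 * lam) / (9 * (n + lam) ** 2)
    return (n + lam) * (1 - var + Z05 * sqrt(var)) ** 3


for n, lam in [(2, 1), (2, 16), (4, 1), (4, 4), (7, 4), (7, 25)]:
    print(n, lam, round(abdel_aty_first(n, lam), 2))
```

The values obtained this way can be set beside the "1st" Abdel-Aty column of Table 2.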

The values obtained through moments and Table 42 require the computation of the first four moments
of χ'², and in particular of

$$\beta_1 = \frac{8(1+2b)^2}{r(1+b)^3}, \qquad \beta_2 = 3 + \frac{12(1+3b)}{r(1+b)^2},$$

where $r = n+\lambda = n_1+\beta^2$ and $b = \lambda/(n+\lambda) = \beta^2/(n_1+\beta^2)$.


Once these values are available, the use of Table 42 is quite simple and direct. It provides approximate 
values for a moderate selection of percentage points, and its accuracy at the upper 5 % point is excellent. 
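The two shape coefficients needed to enter Table 42 follow from the cumulants of non-central χ², which are κ_j = 2^(j−1) (j−1)! (n + jλ). The sketch below evaluates β₁ and β₂ both from the r, b form and directly from the cumulants, as a consistency check. The function names are illustrative.

```python
def beta1_beta2(n, lam):
    """Pearson's beta_1 and beta_2 of non-central chi-square in the r, b form."""
    r = n + lam
    b = lam / (n + lam)
    beta1 = 8 * (1 + 2 * b) ** 2 / (r * (1 + b) ** 3)
    beta2 = 3 + 12 * (1 + 3 * b) / (r * (1 + b) ** 2)
    return beta1, beta2


def beta1_beta2_from_cumulants(n, lam):
    """The same quantities from kappa_2, kappa_3 and kappa_4."""
    k2 = 2 * (n + 2 * lam)
    k3 = 8 * (n + 3 * lam)
    k4 = 48 * (n + 4 * lam)
    return k3 ** 2 / k2 ** 3, 3 + k4 / k2 ** 2


for n, lam in [(2, 4), (4, 4), (7, 4), (7, 16)]:
    print(n, lam, beta1_beta2(n, lam), beta1_beta2_from_cumulants(n, lam))
```

With λ = 0 the expressions reduce to the familiar central values 8/n and 3 + 12/n.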


* Prepared in connexion with research sponsored by the U.S. Office of Naval Research. 
Table 1. Accuracy of approximation (A)

β     Value      n_1 = 1    n_1 = 3    n_1 = 5    n_1 = 7
1     Exact      2.6461     3.1941     3.6291     4.0005
      Approx.    2.6449     3.2029     3.6649     4.0309
3     Exact      4.6449     4.9055     5.1512     5.3840
      Approx.    4.6449     4.9079     5.1549     5.3859
5     Exact      6.6449     6.8162     6.9831     7.1457
      Approx.    6.6449     6.8181     6.9849     7.1453
Table 2. Approximations to the upper 5% points of the non-central χ² distribution

                                       Patnaik’s approx.    Abdel-Aty’s approx.    Moments and
n = n_1   λ = β²   Exact     (A)       1st       2nd        1st       2nd          Table 42
2          1        8.64     8.62      8.63      -          8.56      8.38         -
           4       14.64    14.65     14.72     14.67      14.66     14.62         *
          16       33.05    33.07     33.35     33.06      33.32     33.08        33.07
          25       45.31    45.32     45.66      -         45.64     45.33         -
4          1        [?]      [?]       [?]       -         11.67     11.67         -
           4        [?]     17.36     17.38     17.33      17.34     17.27         *
          16       35.43    35.46     35.69     35.42      35.66     35.44        35.42
          25       47.61    47.64     47.94      -         47.91     47.62         -
7          1       16.00    16.25     16.01      -         15.98     15.99         -
           4       21.23    21.32     21.28     21.27      21.25     21.21        21.23
          16       38.97    38.97     39.16     38.97      39.16     38.96        38.95
          25       51.06    51.06     51.34      -         51.33     51.06         -

* These lie outside the β_1 range for Table 42.

There is some reason to suspect, however, that its accuracy may be close to its best at this particular
percentage point.

If other than an upper 5% point is desired, the method using Table 42 seems simplest and as likely to
be accurate as any. For an upper 5% point, approximation (A) seems simpler and almost as accurate.
(The user demanding highest accuracy may well consider applying Table 42 to Abdel-Aty’s cube-root
transformation, since the formulas for its β_1 and β_2 are given at the foot of his page 538.)
4. SOURCE OF APPROXIMATION (A) 
For large β,

$$B^2 = \beta^2\left(1+\frac{x_n}{\beta}\right)^2 + x_1^2 + \dots + x_{n-1}^2,$$

whence

$$B = \beta\left\{1 + \frac{x_n}{\beta} + O\!\left(\frac{1}{\beta^2}\right)\right\} = \beta + x_n + O\!\left(\frac{1}{\beta}\right),$$

so that the upper 5% point of B − β approaches 1.6449. This fixes the first two terms of (A). The last two
came from empirical examination of a table of $B_{0.05} - 1.6449 - \beta$. Presumably similar approximate
formulas could be obtained by similar methods from brief tables of other exact percentage points.
Uniqueness of a result in the theory of accident proneness


By N. L. JOHNSON 
(University College, London) 


1. A simple type of model for accident proneness (see Greenwood & Yule (1920), Arbous & Kerrich 
(1951), Bates & Neyman (1952), and Arbous & Sichel (1954)) is obtained by supposing 

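(i) the number of accidents sustained by a given individual in a period of fixed (‘unit’) length
(e.g. 1 year) may be represented by a Poisson variable x with

$$\Pr\{x\} = \frac{e^{-\lambda}\lambda^{x}}{x!} \qquad (x = 0, 1, 2, \ldots),$$

and (ii) λ (> 0) varies from individual to individual with a probability density function p(λ).

If p(λ) be assumed to follow the Pearson type III distribution

$$p(\lambda) = \left(\frac{k}{m}\right)^{k}\frac{1}{\Gamma(k)}\,\lambda^{k-1}e^{-(k/m)\lambda} \qquad (0<\lambda),$$

then the overall distribution of observed number (y) of accidents per individual in a given period of unit
length is

$$
\begin{aligned}
\Pr\{y\} &= \int_0^\infty \frac{e^{-\lambda}\lambda^{y}}{y!}\,p(\lambda)\,d\lambda \\
&= \left(\frac{k}{m}\right)^{k}\frac{1}{\Gamma(k)\,y!}\int_0^\infty \lambda^{k+y-1}e^{-(1+k/m)\lambda}\,d\lambda \\
&= \frac{\Gamma(k+y)}{\Gamma(k)\,\Gamma(y+1)}\left(\frac{k}{m+k}\right)^{k}\left(\frac{m}{m+k}\right)^{y},
\end{aligned}
$$

i.e. a negative binomial distribution with mean $k(m/k) = m$ and variance $k\,\dfrac{m}{k}\left(1+\dfrac{m}{k}\right) = m + \dfrac{m^2}{k}$.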
2. Now, let (y, z) be the numbers of accidents sustained by the same individual in two successive
periods of unit length. Assuming that for each individual λ retains the same value in the second period as
it possesses in the first, we find, using Bayes’ theorem,

$$p(\lambda\mid y) = \frac{e^{-\lambda}\lambda^{y}p(\lambda)}{\int_0^\infty e^{-\lambda}\lambda^{y}p(\lambda)\,d\lambda}. \tag{1}$$

Hence

$$
\begin{aligned}
\Pr\{z\mid y\} &= \int_0^\infty \frac{e^{-\lambda}\lambda^{z}}{z!}\,p(\lambda\mid y)\,d\lambda \\
&= \frac{\int_0^\infty \lambda^{k+y+z-1}e^{-(2+k/m)\lambda}\,d\lambda}{\Gamma(z+1)\int_0^\infty \lambda^{k+y-1}e^{-(1+k/m)\lambda}\,d\lambda} \\
&= \frac{\Gamma(k+y+z)}{\Gamma(k+y)\,\Gamma(z+1)}\left(\frac{m+k}{2m+k}\right)^{k+y}\left(\frac{m}{2m+k}\right)^{z}.
\end{aligned}
$$

The conditional distribution of z, given y, is therefore a negative binomial with mean $(k+y)\dfrac{m}{m+k}$ and
variance

$$(k+y)\,\frac{m}{m+k}\left(1+\frac{m}{m+k}\right) = (k+y)\,\frac{m(2m+k)}{(m+k)^2}.$$

The regression of z on y is linear (see also Arbous & Sichel (1954)).

It may also be noted that $\mathcal{E}(z\mid y=0) = mk/(m+k)$, while $\mathcal{E}(z) = \mathcal{E}(y) = m$, so that selection of
individuals free from accidents in the first period reduces the accident rate to be expected in the second
period by a factor k/(m+k).
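The conditional distribution can be checked numerically by summing its probabilities. The sketch below assumes the negative binomial form Pr{z | y} = Γ(k+y+z)/{Γ(k+y) Γ(z+1)} · {(m+k)/(2m+k)}^(k+y) · {m/(2m+k)}^z. The parameter values and the truncation point z_max are arbitrary illustrative choices.

```python
from math import exp, lgamma, log


def pr_z_given_y(z, y, k, m):
    """Negative binomial probability of z accidents in period 2, given y in period 1."""
    log_p = (lgamma(k + y + z) - lgamma(k + y) - lgamma(z + 1)
             + (k + y) * log((m + k) / (2 * m + k))
             + z * log(m / (2 * m + k)))
    return exp(log_p)


def conditional_mean_var(y, k, m, z_max=2000):
    """Mean and variance of z given y by direct summation (truncated at z_max)."""
    probs = [pr_z_given_y(z, y, k, m) for z in range(z_max)]
    mean = sum(z * p for z, p in enumerate(probs))
    var = sum(z * z * p for z, p in enumerate(probs)) - mean ** 2
    return mean, var


k, m = 2.0, 1.5                      # hypothetical shape and mean accident rate
for y in range(4):
    mean, var = conditional_mean_var(y, k, m)
    # Compare with (k + y) m / (m + k) and (k + y) m (2m + k) / (m + k)^2.
    print(y, mean, (k + y) * m / (m + k), var, (k + y) * m * (2 * m + k) / (m + k) ** 2)
```

The printed means increase by the constant step m/(m+k) as y increases, which is the linear regression noted in the text.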








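3. It is the purpose of this note to show that if p(λ) is not a type III probability density function, the
regression of z on y is not linear.

Since $\mathcal{E}(z\mid\lambda) = \lambda$ it follows that

$$\mathcal{E}(z\mid y) = \int_0^\infty \lambda\,p(\lambda\mid y)\,d\lambda.$$

Hence, from (1),

$$\mathcal{E}(z\mid y) = \int_0^\infty e^{-\lambda}\lambda^{y+1}p(\lambda)\,d\lambda \bigg/ \int_0^\infty e^{-\lambda}\lambda^{y}p(\lambda)\,d\lambda$$

and so, for the regression of z on y to be linear, we must have

$$\frac{\int_0^\infty \lambda^{y+1}e^{-\lambda}p(\lambda)\,d\lambda}{\int_0^\infty \lambda^{y}e^{-\lambda}p(\lambda)\,d\lambda} = \alpha + \beta y, \tag{2}$$

where α and β are constants satisfying $\alpha + \beta m = m$ (since $\mathcal{E}(z) = \mathcal{E}(y) = m$) and $\beta > 0$ (since $\mathcal{E}(z\mid y) > 0$ for
all y).

Equation (2) can be rewritten

$$\alpha + \beta y = \frac{\mu'_{y+1}}{\mu'_{y}}, \tag{3}$$

where $\mu'_r$ is the rth moment about zero of the probability density function

$$e^{-\lambda}p(\lambda) \bigg/ \int_0^\infty e^{-\lambda}p(\lambda)\,d\lambda \qquad (\lambda>0). \tag{4}$$

Since $\mu'_0 = 1$, it follows that

$$\mu'_1 = \alpha > 0, \qquad \mu'_2 = \alpha(\alpha+\beta), \qquad \mu'_3 = \alpha(\alpha+\beta)(\alpha+2\beta),$$

and in general

$$\mu'_r = \alpha(\alpha+\beta)\cdots(\alpha+(r-1)\beta).$$

Hence $\mu'_r < (r\beta')^r$ and so $\lim_{r\to\infty}\,(\mu_r'^{\,1/r}/r) < \beta'$, where $\beta' = \alpha+\beta$.

Hence the moments $\{\mu'_r\}$ determine the distribution (4) uniquely; and p(λ) is, therefore, also determined
uniquely by (2). A function p(λ) of type III form does satisfy (2), and this is therefore the only form of
probability density function satisfying (2).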
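That a type III density does satisfy (2) can be illustrated numerically. Under the type III assumption, the density (4) is itself of gamma form with index k and scale m/(m+k), so its moments should equal the product α(α+β)...(α+(r−1)β) with β = m/(m+k) and α = kβ. These identifications of α and β are derived here for illustration and are not quoted from the text.

```python
from math import exp, lgamma, log


def tilted_moment(r, k, m):
    """r-th moment about zero of density (4) when p is type III.

    The density is then proportional to lambda^(k-1) * exp(-(1 + k/m) * lambda).
    """
    rate = 1 + k / m
    return exp(lgamma(k + r) - lgamma(k) - r * log(rate))


def product_form(r, alpha, beta):
    """alpha * (alpha + beta) * ... * (alpha + (r - 1) * beta)."""
    out = 1.0
    for j in range(r):
        out *= alpha + j * beta
    return out


k, m = 2.5, 1.2                      # hypothetical parameter values
beta = m / (m + k)
alpha = k * beta
print(alpha + beta * m, m)           # the constraint alpha + beta * m = m
for r in range(6):
    print(r, tilted_moment(r, k, m), product_form(r, alpha, beta))
```

The ratio of successive moments is then α + βy, which is the linear regression required by (3).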
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Miscellanea 


A note on the mean deviation of the binomial distribution 


By N. L. JOHNSON 
University College, London 


In a paper dealing with the mathematical theory of risk in insurance Gruder (1930) has shown that

This result may be applied to the evaluation of the mean deviation of a binomial distribution, an
application which, it is believed, is new. The mean deviation of the binomial distribution is equal to

Hence, using (1), the mean deviation of the binomial distribution is

The ratio of mean deviation to standard deviation is

Applying Stirling’s formula in the form

The table below compares the exact formula (3) with the approximate formula (4) for some typical
values of n and p. The ratio takes the same value for p = x and p = 1 − x, so values of p are restricted to
the range 0.1 (0.1) 0.5. It will be noted that the Normal limiting value of √(2/π) = 0.7979 is approached
quite rapidly as n increases. The approximate formula (4) gives nearly three figure accuracy for
0.2 < p < 0.8, while for n ≥ 50 the exact and approximate values agree to four decimal places.
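The exact ratio can be obtained without any closed formula by summing |x − np| over the binomial probabilities. This is a minimal sketch by direct summation; it does not use formulas (3) or (4) of the note, and the function name is illustrative.

```python
from math import comb, pi, sqrt


def md_sd_ratio(n, p):
    """Ratio of mean deviation to standard deviation for a binomial(n, p) variable."""
    q = 1 - p
    mean = n * p
    md = sum(abs(x - mean) * comb(n, x) * p ** x * q ** (n - x) for x in range(n + 1))
    return md / sqrt(n * p * q)


for n in (10, 100):
    print(n, [round(md_sd_ratio(n, p), 4) for p in (0.1, 0.3, 0.5)])
print("normal limit", round(sqrt(2 / pi), 4))
```

Because the ratio is symmetric in p and 1 − p, only values of p up to 0.5 need to be evaluated.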


         p = 0.1           p = 0.2           p = 0.3           p = 0.4           p = 0.5
n       (3)     (4)       (3)     (4)       (3)     (4)       (3)     (4)       (3)     (4)
10     0.7351  0.7313    0.7640  0.7630    0.7733  0.7729    0.7771  0.7768    0.7782  0.7779
20     0.7652  0.7642    0.7807  0.7804    0.7855  0.7854    0.7874  0.7874    0.7879  0.7879
50         0.7846            0.7909            0.7929            0.7937            0.7939
100        0.7912            0.7949            0.7954            0.7958            0.7959

REFERENCE 


GRUDER, O. (1930). 9th International Congress of Actuaries, vol. 1, p. 222. 


A comment on D. V. Lindley’s statistical paradox 


By M. S. BARTLETT
University of Manchester 


I read with considerable interest the discussion by D. V. Lindley (1957), demonstrating the possibility of
contradiction between the result of a statistical significance test and an assessment of the posterior
probability of a null hypothesis. I would agree that he establishes the point that one must be cautious
when using a fixed significance level for testing a null hypothesis irrespective of the size of sample one is
taking. However, there is a slip, in his expression for K under his equation (1), that appears to me, unless
corrected, to lead to an overstatement of this point. The prior distribution for θ, given that $\theta \ne \theta_0$, was
assumed to be uniform over an interval I, and hence its density function should be 1/I in this interval.
This leads to the extra factor 1/I in the second term in the expression for K.* This expression then becomes
consistent with Jeffreys’s equation (10), § 5.0, in his book (second edition, 1948).

The occurrence of I in the formula for the posterior probability $\bar{c}$ of the null hypothesis $\theta_0$, this quantity
$\bar{c}$ satisfying approximately the relation

$$\frac{\bar{c}}{1-\bar{c}} = \frac{c}{1-c}\,I\,\sqrt{\frac{n}{2\pi\sigma^2}}\,\exp\left\{-\frac{n(\bar{x}-\theta_0)^2}{2\sigma^2}\right\}, \tag{1}$$

where c is the prior probability of $\theta_0$, now makes the value of $\bar{c}$ much more arbitrary. In fact, in situations
where one might be tempted to put I infinite the silly answer $\bar{c} = 1$ ensues. D. V. Lindley has suggested
to me, in correspondence, that one way out of this dilemma would be to make c/(1−c) the prior odds in
favour of the null hypothesis against any unit interval of the alternative values, but this is rather an
artificial evasion of the difficulty. It is common for those who use the Bayes’s approach to assume
a uniform prior distribution for the mean of a normal population. If the difference in means between two
populations (for simplicity, of known equal variances) is considered, the question might legitimately be
asked whether these populations, from each of which a sample is available, are identical. The most natural
prior probabilities would seem to me, if we try to use the Bayes’s approach, to be c for this null hypothesis,
and a remaining uniform prior distribution of the true difference in population means over the entire
infinite range.

The other point that might be noticed about formula (1), if we disregard the above difficulty and agree
to leave I finite, is this. Certainly, for a fixed significance level (and I and σ fixed), the posterior odds
increase with √n. But from the Neyman-Pearson theory of the power of tests, if a null hypothesis $\theta_0$ is
being tested against a single alternative $\theta_1$, the sample size n would if possible be chosen in relation to
the ‘distance’ $d = \theta_1 - \theta_0$ between the two hypotheses (√n inversely proportional to d). The situation under
discussion is a little more complicated but, with a range of I for the alternatives, it would be fairly
reasonable to choose the sample size n analogously, making √n proportional to 1/I. If we write $\sqrt{n} = A\sigma/I$,
we obtain

$$\frac{\bar{c}}{1-\bar{c}} = \frac{c}{1-c}\,\frac{A}{\sqrt{2\pi}}\,e^{-\frac{1}{2}\lambda^2}, \tag{2}$$

where $\lambda = \sqrt{n}\,(\bar{x}-\theta_0)/\sigma$; in (2) there is a constant relation between $\bar{c}$ and λ for fixed c.
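The contrast between (1) and (2) can be made concrete with a few numbers. The sketch below assumes the posterior odds in (1) take the form {c/(1−c)} · I · √{n/(2πσ²)} · exp{−n(x̄−θ₀)²/(2σ²)}, and that (2) follows on substituting √n = Aσ/I. All numerical values are hypothetical.

```python
from math import exp, pi, sqrt


def posterior_odds(c, I, n, sigma, xbar, theta0):
    """Posterior odds on the null hypothesis, relation (1)."""
    return ((c / (1 - c)) * I * sqrt(n / (2 * pi * sigma ** 2))
            * exp(-n * (xbar - theta0) ** 2 / (2 * sigma ** 2)))


def posterior_odds_scaled(c, A, lam):
    """Posterior odds when sqrt(n) = A * sigma / I, relation (2)."""
    return (c / (1 - c)) * (A / sqrt(2 * pi)) * exp(-0.5 * lam ** 2)


c, sigma, theta0, lam = 0.5, 1.0, 0.0, 1.96   # result just significant at the two-sided 5% level

# Fixed I: the odds in favour of the null hypothesis grow with sqrt(n).
for n in (10, 100, 1000, 10000):
    xbar = theta0 + lam * sigma / sqrt(n)
    print("I fixed   n =", n, " odds =", round(posterior_odds(c, 1.0, n, sigma, xbar, theta0), 3))

# I shrinking as 1/sqrt(n) (A fixed): the odds stay constant.
A = 4.0
for n in (10, 100, 1000, 10000):
    I = A * sigma / sqrt(n)
    xbar = theta0 + lam * sigma / sqrt(n)
    print("A fixed   n =", n, " odds =", round(posterior_odds(c, I, n, sigma, xbar, theta0), 3),
          " (2) gives", round(posterior_odds_scaled(c, A, lam), 3))
```

With I held fixed the same significance level corresponds to steadily stronger support for the null hypothesis, whereas tying the range of alternatives to the sample size removes the dependence on n.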


REFERENCES 
JEFFREYS, H. (1948). Theory of Probability, 2nd ed. Oxford: Clarendon Press. 
LINDLEY, D. V. (1957). A statistical paradox. Biometrika, 44, 187-92.


Editorial Note. Mr Lindley agrees and apologizes for the fact that a factor 1/I was omitted from his
equation (1), but points out that in the two examples which he discusses this factor is unity. His general
argument as to the limiting value of $\bar{c}$ is in any case unaffected, and his two particular examples are also
unaffected. There appears to be no real difference of opinion between Prof. Bartlett and Mr Lindley on
this point.

The point raised by Prof. Bartlett’s second paragraph is related to the difficulty of laying down a uni-
form prior probability for a parameter of infinite range, a point which, in my opinion, has not been
properly cleared up; if the probability of μ is k dμ, integration over the infinite range leads to the con-
clusion that k is zero (the integral having to be unity). The root of this difficulty seems to be that several
limiting processes are involved and no clear rules have been laid down as to which, if any, has priority.
In any case, this point mainly concerns estimation, whereas Mr Lindley was concerned with testing
hypotheses.

In regard to Prof. Bartlett’s final point, it may be useful to observe that some procedure of the type he 
suggests is implicit in the idea of the asymptotic relative efficiency of a test. In general, the power of a test 
against a specific alternative tends to unity with increasing sample size. To compare two tests asympto- 
tically at a fixed significance level, it is necessary to allow the alternative to approach the null hypothesis 
as the sample size increases. M.G.K. 


* There is also a further dropping of a factor 1/o in the last formula on p. 191, but this is a more 
trivial slip. 


CORRIGENDA 
Biometrika (1957), 44, pp. 150-8 


‘The use of a concomitant variable in selecting an experimental design.’ By D. R. Cox


Dr K. R. Nair has kindly pointed out that some of the results for Methods II and V of 
the above paper have been given by him in Sankhya (1942), 6, 167-174. He has also noted 
that in formula (6) of my paper (%;—%,)? should read k(%;—%,)*. D.R.C. 


Biometrika (1957), 44, pp. 168-78 


‘Multiple runs.’ By D. E. BARTON and F. N. DAVID


P. 174, line 10. Delete ‘transition probabilities’ and substitute ‘joint probabilities of two 
successive events’. 


P. 174, line 13. Delete the full stop after the expression for W and add ‘and where 
p, = 1k, (= 1, 2, ..., be)’. 
REVIEWS


Proceedings of the Third Berkeley Symposium on Mathematical Statistics and
Probability. Edited by JERZY NEYMAN. University of California Press, for whom
Cambridge University Press act as agents. 1957. Vol. I: Theory of Statistics,
pp. 208, 45s. Vol. II: Probability Theory, pp. 246, 49s. Vol. III: Astronomy and
Physics, pp. 252, 47s. Vol. IV: Biology and Problems of Health, pp. 179, 43s. 6d.
Vol. V: Econometrics, Industrial Research and Psychometry, pp. 184, 43s. 6d.


Journal space and the limited versatility of the reviewer would preclude anything like a critical 
appraisal of these books in which are printed all the papers presented at the third symposium held in 
1954/55 at the Statistical Laboratory of the University of California at Berkeley. I propose therefore 
to write a ‘contents’ review in order to show that there is something for everyone in the reported 
proceedings. 

Volume I. There are papers by: J. Berkson, ‘Estimation by Least Squares and by Maximum
Likelihood’; Z. W. Birnbaum, ‘On a Use of the Mann-Whitney Statistic’; H. Chernoff and H. Rubin,
‘The Estimation of the Location of a Discontinuity in Density’; A. Dvoretzky, ‘On Stochastic
Approximation’; S. Ehrenfeld, ‘Complete Class Theorems in Experimental Design’; G. Elfving,
‘Selection of Nonrepeatable Observations for Estimation’; U. Grenander and M. Rosenblatt, ‘Some
Problems in Estimating the Spectrum of a Time Series’; J. L. Hodges and E. L. Lehmann, ‘Two Approxi-
mations to the Robbins-Monro Process’; W. Hoeffding, ‘The Role of Assumptions in Statistical
Decisions’; S. Karlin, ‘Decision Theory for Pólya Type Distributions. Case of Two Actions’; L. Le Cam,
‘On the Asymptotic Theory of Estimation and Testing Hypotheses’; H. Robbins, ‘An Empirical Bayes
Approach to Statistics’; M. Rosenblatt, ‘Some Regression Problems in Time Series Analysis’; C. Stein,
‘Efficient Nonparametric Testing and Estimation’ and ‘Inadmissibility of the Usual Estimator for the
Mean of a Multivariate Normal Distribution’; B. L. van der Waerden, ‘The Computation of the
X-distribution’.

The majority of the papers are on technical points and in the mathematical style familiar to readers
of the Annals of Mathematical Statistics. The most interesting paper to the reviewer is that by Dr Berkson
who with his commonsense approach and tenacity of purpose refuses to be blinded by mathematics
and who in this paper raises once again the challenge in the problem.

Volume II. There are papers by: D. Blackwell, ‘On a Class of Probability Spaces’; S. Bochner,
‘Stationarity, Boundedness, Almost Periodicity of Random-Valued Functions’; K. L. Chung, ‘Founda-
tions of the Theory of Continuous Parameter Markov Chains’; A. H. Copeland, ‘Probabilities, Obser-
vations and Predictions’; J. L. Doob, ‘Probability Methods Applied to the First Boundary Value
Problem’; R. M. Fortet, ‘Random Distributions with an Application to Telephone Engineering’;
J. M. Hammersley, ‘The Zeros of a Random Polynomial’; T. E. Harris, ‘The Existence of Certain
Markov Processes’; K. Itô, ‘Isotropic Random Current’; P. Lévy, ‘A Special Problem of Brownian
Motion, and a General Theory of Gaussian Random Functions’; M. Loève, ‘Ranking Limit Problem’;
E. Lukacs, ‘Characterization of Populations by Properties of Suitable Statistics’; K. Menger, ‘Random
Variables from the Point of View of a General Theory of Variables’; E. Mourier, ‘Random Elements in
Banach Spaces’; R. Salem and A. Zygmund, ‘Random Trigonometric Polynomials’.

The papers in this volume appear to be written by probabilists for the benefit of other probabilists 
and most of them reflect the current interest of probabilists in the random process. There is no 
immediate statistical link-up—there is no reason why there should be—and applied statisticians 
generally will find this volume unrewarding. The theoretical statistician seeking to apply the random 
process may find a study of it useful. 

Volume III. This volume is divided into two sections, one for astronomy and one for physics. The
contributions to astronomy consist of (i) Hertzsprung-Russell Diagram, with papers by: O. J. Eggen,
‘The Relationship between the Color and the Luminosity of Stars near the Sun’; J. L. Greenstein,
‘The Spectra and other Properties of Stars Lying Below the Normal Main Sequence’; H. L. Johnson,
‘Photoelectric Studies of Stellar Magnitudes and Colors’; G. E. Kron, ‘Evidence for Sequences in the
Color-Luminosity for M-Dwarfs’; B. Strömgren, ‘The Hertzsprung-Russell Diagram’; and of (ii) Spatial
Distribution of Galaxies, with papers by: G. C. McVittie, ‘Galaxies, Statistics and Relativity’;
J. Neyman, E. L. Scott and C. D. Shane, ‘Statistics of Images of Galaxies’; F. Zwicky, ‘Statistics of
Clusters of Galaxies’. The contributions to physics are from: A. Blanc-Lapierre and A. Tortrat,
‘Statistical Mechanics and Probability Theory’; M. Kac, ‘Foundations of Kinetic Theory’; E. Montroll,
‘Theory of the Vibration of Simple Cubic Lattices’; and N. Wiener, ‘Nonlinear Prediction and
Dynamics’.

Statisticians of all types will find this volume of interest and need not be deterred by a lack of
knowledge of astronomy. The papers are well and clearly written, and it appears possible from them to
get some idea of work being done by statisticians in fields other than those we think of as orthodox
(i.e. biology, economics, etc.).

Volume IV. There are papers by: J. Crow and M. Kimura, ‘Some Genetic Problems in Natural
Populations’; E. R. Dempster, ‘Some Genetic Problems in Controlled Populations’; J. Neyman,
T. Park and E. L. Scott, ‘Struggle for Existence. The Tribolium Model’; M. S. Bartlett, ‘Deterministic
and Stochastic Models for Recurrent Epidemics’; A. T. Bharucha-Reid, ‘On the Stochastic Theory of
Epidemics’; C. L. Chiang, J. L. Hodges and J. Yerushalmy, ‘Statistical Studies in Medical Diagnoses’;
J. Cornfield, ‘A Statistical Problem Arising from Retrospective Studies’; D. G. Kendall, ‘Deterministic
and Stochastic Epidemics in Closed Populations’; W. F. Taylor, ‘Problems in Contagion’.

The level of these papers in this volume is high. They are written in a mathematical language that 
all statisticians should be able to understand, and since there is a concrete problem on which the 
mathematics is hung there is a sense of purpose about them which is often lacking in papers such as 
those, for example, in Volume I. The fields of application are the traditional biometric ones and in 
consequence this volume will be needed by many workers in the statistical field. 

Volume V. Under the subtitle ‘Contributions to Econometrics’ there are papers by: K. J. Arrow and
L. Hurwicz, ‘Reduction of Constrained Maxima to Saddle-Point Problems’; E. W. Barankin, ‘An
Objectivistic Theory of Probability’; C. W. Churchman, ‘Problems of Value Measurement for a Theory
of Induction and Decisions’; P. Suppes, ‘The Role of Subjective Probability and Utility in Decision-
Making’. Under the subtitle ‘Contributions to Industrial Research’ there are papers by: A. H. Bowker,
‘Continuous Sampling Plans’; C. Daniel, ‘Fractional Replication in Industrial Research’; M. Sobel,
‘Sequential Procedures for Selecting the Best Exponential Population’. Under the subtitle ‘Contribu-
tions to Psychometry’ there are papers by: T. W. Anderson and H. Rubin, ‘Statistical Inference in
Factor Analysis’; F. Mosteller, ‘Stochastic Learning Models’; and H. Solomon, ‘Item Analysis and
Classification Techniques’. F. N. DAVID


Fejezetek a klasszikus valószínűségszámításból. (Chapters on the classical probability
calculus.) By JORDAN KÁROLY (Charles Jordan). Budapest: Akadémiai Kiadó.
1956. Pp. 616. 120 forint.


This book presents the theory of probability as the famous Hungarian statistician sees it after 50 years’
work and 30 years’ lecturing. Big changes have taken place during that time; but as the new point of
view is set out in a book by Rényi, the author here concentrates on the classical methods and especially
their practical applications. The book begins with introductory and historical remarks, definitions and
a long chapter on mathematical techniques. It then goes on to arithmetic and geometric expectations,
the binomial, Poisson and Lexis distributions in one and two variables, applications to games, gambler’s
ruin, geometric probabilities, the theory of errors and least squares, and the kinetic theory of gases. The
book is beautifully printed and well indexed, and includes a short table of binomial coefficients. Hun-
garians are very lucky to have such a volume in their language, although unfortunately this means that
other nationalities will be effectively deprived of the use of the book. CEDRIC A. B. AND PIRI SMITH


Mathematische Statistik (Band LXXXVII of the Grundlehren der Mathematischen
Wissenschaften series). By B. L. VAN DER WAERDEN. Berlin: Springer Verlag. 1957.
Pp. 360. D.M. 49.60 (£4. 9s.).


It is interesting to compare this book with another of the same title by Schmetterer, written earlier in 
1956, and published by Springer in Vienna (see Biometrika, 44, 293). Van der Waerden achieves 
a wider range of topics in a shorter space by adopting a more informal and less systematic approach 
than Schmetterer and by going less deeply into the general theory. Thus, although he is careful to 
preserve mathematical rigour, the author treats the commoner, simpler and more frequently applied 
tests and estimation problems. He opens with a simple presentation of Kolmogoroff’s axioms and 
follows with an examination of frequency distributions and the information contained in their means and 
variances. He then treats Kolmogoroff’s K and allied measures and develops the mathematical theory 
of transformation of variates and characteristic functions. Next comes the theory of distributions 
derived from the normal followed by two short chapters on estimation by least squares and maximum 
likelihood and minimum χ². A chapter on bioassay intervenes before consideration of simpler tests and 
the bare outlines of the Neyman–Pearson theory. The last two chapters are concerned with Wilcoxon’s 
and allied tests and with correlation, both product moment and by rank methods. 

The book is beautifully produced, as indeed it should be at its formidable price. D. E. BARTON 


Statistical Methods in Research and Production with special reference to the 
Chemical Industry (third edition revised). Edited by O. L. DAVIES. Edinburgh: 
Oliver and Boyd Ltd. 1957. Pp. 396. 45s. 


This book is the third edition of one first published in 1947, and runs to 390 pages against 286, while of 
eight authors listed five names are common to the seven listed in the first edition. (The reviewer has 
not seen the second edition). The potential reader should not be put off by the expression ‘with special 
reference to the chemical industry’ if that does not cover his own field, as all this means is that the 
examples are mostly chosen from chemical subjects. 

The book consists mainly of a practical exposition of the logic and application of the more elementary 
statistical methods. Personally I would consider that the treatment is weighted somewhat too much 
on the application side, since in my experience amateur statisticians rarely fail to apply methods cor- 
rectly, but are often weak in the logic of what they are doing. In any statistical method there is quite 
a narrow range of readers between those who know it already and those who are unwilling to make the 
effort to learn it, and everyone tends to have different ideas on what the typical reader should be assumed 
to know, but I feel that actual readers will mostly be less logical and more arithmetical than is apparently 
assumed by the authors. 

The methods of exposition and the notation are admirably clear. Trying out on myself a section 
describing a method which I have not used I had no difficulty at all in understanding and applying it. 

The only part of the book I disagree with to any serious extent is the statement on page 1 that 
‘statistics may be defined as the study of chance variations’. It is surely truer to say that statistics is 
the study of how to draw valid conclusions from numerical data in spite of interference from chance 
variations. 

The scatter diagram on page 191, said to represent a low correlation, in fact corresponds to one of 
approximately 0.584. 

The book may be strongly recommended to anyone, even with very little mathematical knowledge, 
wishing to apply statistical methods to almost any form of industrial research or production. 


L. MCMULLEN 


Statistical Methods in Quality Control. By D. J. COWDEN. New York: Prentice- 
Hall Inc. 1957. Pp. xxiv + 727. $12.00. 


This book is extremely comprehensive although as a result it is unfortunately very long. It covers not 
only all the common methods of quality control, but also gives the basic theory of statistical methods 
required, going somewhat further in this direction than would be needed in order to carry out just the 
basic techniques in quality control. 

The first eleven chapters deal with statistical theory starting with measures of location, dispersion 
and shape, following with the elements of probability theory, exemplified by the binomial theorem, 
and leading up to the normal curve. This is followed by methods of estimation, tests of hypotheses and 
confidence limits together with simple analyses of variance and tests for the homogeneity of variances. 
The remaining twenty-nine chapters deal with quality control, interspersed with some chapters on 
portions of statistical theory, such as frequency curves, the hypergeometric distribution and regression 
techniques. Single, double and multiple inspection schemes are covered and the concepts of the 
operating characteristic and average outgoing quality are dealt with fully. There is a very interesting 
chapter on the economics of control charts, although it is a pity that the discussion is restricted to the 
case where the inspection scheme is based on single samples. The book does not deal in any detail with 
continuous inspection schemes although Dodge’s scheme where sampling is reduced until a defective 
appears is mentioned. The schemes based on the combination of small samples, such as the deferred 
sentencing schemes, or the schemes using both warning and outer limits in combination, suggested by 
B. P. Dudding and W. J. Jennett, and further investigated by E. S. Page, are not discussed. The 
exposition throughout is very clear and all the explanations given are extremely comprehensive. It is 
refreshing to find that the author faces up to such problems as non-normality and deals with them 
realistically. 

There are twenty-seven appendices containing useful tables for many of the tests and techniques 
discussed in the book including such distributions as the studentized range, W. J. Dixon’s ratios for 
testing extreme values, the circular autocorrelation coefficient with lag 1 and certain percentage points 
of the Gram–Charlier curve. 

This book would probably make a difficult course in statistics for a student to work through on his 
own because it is very hard to see the wood for the trees. The student would also be left in some 
difficulty as to which of the many techniques available he should apply in any given practical problem. 
The volume would, however, be very useful as a textbook for a course on quality control when the 
teacher could sign-post the way and use much of the material in the book for illustration and 
consolidation. As a refresher and reference work, too, the book should have an assured place. P. G. MOORE 


Nonparametric Statistics for the Behavioral Sciences. By S. SIEGEL. London: 
McGraw-Hill Publishing Co. Ltd. 1956. Pp. 313. 49s. 


It is stated that this book will be of ‘special interest to practising behavioral scientists, practising 
statisticians, and to somewhat advanced students of these fields’. Actually as one of the McGraw-Hill 
Series in Psychology it is undoubtedly aimed at psychologists and very useful it should prove to them. 
What Mr Siegel has done is to collect together a number of (so-called) non-parametric tests which are 
either in common use or are obviously of use in the analysis of psychological data and to divide them 
up into approximately appropriate compartments. Thus, we are given in the one sample case, the 
binomial test, χ², Kolmogoroff–Smirnoff, and a runs test. For two samples we have Wilcoxon, Wald, 
Randomization and the various tests for a 2 x 2 table. The several procedures are quite well set out and 
are illustrated by numerical examples. There is no real attempt at mathematical derivations. 

Neyman and Pearson in introducing their extremely useful concept of a power function, made possible 
the development of a number of new ideas in what, for the sake of distinction, I might call parametric 
statistics. But while in the mathematical development it is desirable to represent the hypothesis 
alternate to that tested by some sort of functional form, in practice one is usually without any clear idea 
of what the alternative hypothesis may be, and one is lucky if it is possible to assign it to a broad 
class. Non-parametric statistical tests are merely tests for randomness and the ways in which data can 
depart from randomness are infinite. It is only possible therefore to specify very broadly the form 
which deviations from randomness can take in order to help with the choice of a critical region for the 
test statistic and to decide which test statistic would be the more appropriate. A few general principles 
can, however, be adopted if one uses commonsense. For example, a criterion based on ranks will 
usually be more sensitive than a criterion based on the grouping of observations (such as runs), provided 
like is compared with like. 

Mr Siegel does not really come to grips with this business of the comparison of tests. He uses a con- 
cept called power-efficiency which lends a spurious depth to some of his discussions without really 
contributing a great deal to one’s ideas about the situations suitable for the application of the tests 
which he proposes. This is a pity because, as the reviewer has tried to indicate, the application of non- 
parametric tests must of necessity be based on commonsense rather than on the mathematical develop- 
ment of the alternate hypothesis. His book cannot therefore be recommended as other than a useful 
summary of some parametric tests in everyday use. F. N. DAVID 


Scientific Inference (second edition). By SIR HAROLD JEFFREYS. London: Cambridge 
University Press. 1957. Pp. 236. 25s. 


The first edition of this book was published in 1931, re-issued with addenda in 1937, and now is pre- 
sented largely re-written and developed. It was reviewed in this journal in 1932. To those familiar with 
Sir Harold’s book on probability and with the first edition of this book, the present edition holds no 
surprises; the author takes the point of view which we have come to expect. But whether we agree or 
not with the ideas put forward on probability (direct and inverse), or sampling, or significance, to 
mention a few among many topics, the writing is stimulating and a challenge to the reader. It is books 
such as these which are the leaven in the dough of academic textbooks. 


Introduction to Statistical Analysis (second edition). By WILFRID J. DIXON and 
FRANK J. MASSEY JR. London: McGraw-Hill Publishing Co. Ltd. 1957. Pp. 481. 45s. 


This is the second edition of a textbook which was reviewed by John Wishart in 1952 (Biometrika, 
Vol. 39). There are some changes and additions. The chapters on ‘Various Measures of Central Value 
and Dispersion’, on ‘Statistical Inference’ and on ‘Analysis of Variance’ have been re-written and 
enlarged. A small amount of material has been interpolated in other chapters, and a completely new 
chapter on probability has been added at the very end. The quantity of statistical tables given in the 